I. Introduction



We are interested in estimating the speed of convergence towards equilibrium for a finite 
and reversible Markov chain, a well studied problem in the theory of Markov chains, see 
[11] for instance. Most, if not all results in this direction yield bounds on the distance 
to equilibrium which are uniform with respect to the initial distribution of the chain. In 
this paper, we shall rather derive estimates on mixing times that take into account the 
dependence on the initial law. As an example of application of our method, we study the 
Metropolis dynamics of Derrida's Random Energy Model (REM).

Convergence times for the Metropolis dynamics of spin glasses were considered in [7]. Let
us note that the present paper was written simultaneously with [7] and is quoted therein as
[11] with a slightly different title. In [7], estimates on the convergence time that depend
on the initial law are given for models of spin glasses such as the REM or the
Sherrington-Kirkpatrick model at high temperature. Three dynamics are considered: the random
hopping time dynamics (RHT), the Glauber dynamics and the Metropolis dynamics. The
initial configuration of the dynamics is always assumed to be chosen uniformly among all
configurations.

To compare the results obtained in the two articles, let us mention that the starting points
of the present article and [7] are the same: the generalized Poincaré inequalities that were
introduced in [5], see section II here and in [7]. However, the ways of estimating the associated
constant $\mathcal{L}_\eta(p)$, see (2.7) here and (2.2) there, are completely different. We will come back
to this point later.

Since two slightly different notions of convergence time are used here and there, we first
note that in [7] the time, called $T^\omega(c)$, is defined as in (2.8), with $c$ playing the role of
$\epsilon$. $T^\omega(c)$ a priori depends on the realization of the energies, as the superscript $\omega$ emphasizes. In any
case, the initial law, $\eta$, is uniform.

Here, for the Metropolis dynamics of the REM, the results are given in terms of a time
denoted $T_N(\epsilon, c, \eta)$, which is independent of the realization $\omega$, see (4.11). It follows from
the definition (4.11) that on a subset $\Omega_N$ of realizations of the energies that has a probability
larger than $1 - e^{-cN}$ we have

$$T_N(\epsilon, c, \eta) \ge T^\omega(\epsilon);$$

in particular this implies that, almost surely,

$$\limsup_{N\to\infty} \frac{1}{N} \log T^\omega(\epsilon) \le \limsup_{N\to\infty} \frac{1}{N} \log T_N(\epsilon, c, \eta) \qquad (1.1)$$

We now recall some results from [4] and [7] for the convergence time of the Metropolis
dynamics of the REM. In [7], it was proven that for $\eta$ the uniform measure on $\{-1,+1\}^N$,
we have, for almost all $\omega$,

$$\limsup_{N\to\infty} \frac{1}{N} \log T^\omega(\epsilon) \le 2\beta^2 \quad \text{when } \beta < \beta_c \qquad (1.2)$$

and

$$\limsup_{N\to\infty} \frac{1}{N} \log T^\omega(\epsilon) \le 2\beta\beta_c \quad \text{when } \beta > \beta_c \qquad (1.3)$$

(Remember that the free energy and the mean energy per site converge for almost all $\omega$, as
follows from [8].) Note however that using the spectral gap estimates for the Metropolis
dynamics of the REM given in [4], we immediately get that, for all $\beta > 0$, almost surely
(in $\omega$),

$$\limsup_{N\to\infty} \frac{1}{N} \log T^\omega(\epsilon) \le \beta\beta_c \qquad (1.4)$$

and by checking all the probability estimates in [4], we also have, for all $\beta > 0$ and for all $c > 0$,

$$\limsup_{N\to\infty} \frac{1}{N} \log T_N(\epsilon, c, \eta) \le \beta\beta_c \qquad (1.5)$$

Therefore, (1.2) gives a better estimate than (1.4) only for $\beta < \beta_c/2$. For $\beta > \beta_c/2$, (1.4)
gives a better estimate than (1.2) and (1.3).
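The comparison between these exponential rates can be checked numerically. The following is a minimal sketch, assuming the exponents read $2\beta^2$ (below $\beta_c$) and $2\beta\beta_c$ (above $\beta_c$) for the bounds of [7], and $\beta\beta_c$ for the spectral gap bound, with $\beta_c = \sqrt{2\log 2}$; the function names are illustrative and do not come from the paper.

```python
import math

# Critical inverse temperature of the REM, beta_c = sqrt(2 log 2).
BETA_C = math.sqrt(2.0 * math.log(2.0))


def rate_from_7(beta):
    """Exponent assumed for (1.2)-(1.3): 2 beta^2 below beta_c, 2 beta beta_c above."""
    return 2.0 * beta ** 2 if beta < BETA_C else 2.0 * beta * BETA_C


def rate_from_gap(beta):
    """Exponent assumed for (1.4)-(1.5), coming from the spectral gap: beta beta_c."""
    return beta * BETA_C


def better_bound(beta):
    """Label of the smaller (hence better) exponent at inverse temperature beta."""
    return "(1.2)" if rate_from_7(beta) < rate_from_gap(beta) else "(1.4)"
```

Under these assumptions the crossover sits at $\beta = \beta_c/2$, since $2\beta^2 < \beta\beta_c$ exactly when $\beta < \beta_c/2$.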

Here we prove that for the Metropolis dynamics of the REM, for all $\beta < \beta_c$,

$$\limsup_{N\to\infty} \frac{1}{N} \log T_N(\epsilon, c, \eta) \le \beta^2 \qquad (1.6)$$

which together with (1.1) and (1.5) gives, for all $\beta > 0$, a better estimate than (1.2) and
(1.3). Thus we have improved the results of [7] in two ways: first, we are using a more
precise definition of the convergence time; second, we gained a factor 2 in the upper bound
for $\beta < \beta_c$.

Note however that to get (1.4) or (1.5) a very careful analysis of optimization problems for 
paths on the weighted graph structure induced by the transition matrix of the dynamics 
was used. To prove (1.6), a similar analysis is needed. Thus, using the specific paths 
constructed in [4] , instead of techniques based on estimates of the partition function as in 
[7], leads, for the Metropolis dynamics of the REM, to an improvement by a factor 2 in 
the estimates. 

We believe that the bound (1.6) is sharp, i.e. $\lim_{N\to\infty} \frac{1}{N}\log T_N(\epsilon, c, \eta) = \beta^2$ for $\beta < \beta_c$.
We also believe that a similar analysis could be carried out for the Glauber dynamics,
but the numerical factor in front of $\beta^2$ in (1.6) would then be different. As far as the
Random Hopping Time dynamics is concerned, it seems that the techniques of [7] directly
lead to upper bounds of the correct order. Note however that the RHT dynamics has a
much simpler structure than the Metropolis or Glauber ones. Indeed the RHT dynamics is
nothing but a time-changed standard random walk on configuration space. The sequence of
the different states visited by the process is independent of the Hamiltonian. On the one hand,
this feature very much simplifies the geometry. On the other hand, physicists believe that
the evolution of the process should rather look like a random perturbation of the steepest
gradient dynamical system. The RHT dynamics displays unphysical features.

The organization of the paper is as follows: parts II and III deal with general reversible
Markov chains on a finite set. In part II, we define generalized Poincaré inequalities
and show that they control the decay of the semi-group (Theorem 2.1). Then we derive
geometric estimates for the generalized Poincaré constants (Theorem 2.2). Part III contains
an application of these results in a case where the state space can be split into two
components: 'good' and 'bad' points. For the reader's convenience, we decided to give
self-contained proofs of our results at the risk of repeating arguments already used in [5],
[6] or [7].

Although we shall not directly use the results of part III to study the R.E.M., the strategy
will be the same. Only technical aspects make the computation for the R.E.M. a little
longer than the proof in part III. In part IV, we precisely define the R.E.M. and state
our bounds for the thermalization time (Theorems 4.1 and 4.2). Then we proceed to the
proofs. In part V, we extend our results to the process of the environment as seen from
the particle. This section is similar to section 3 of [7], with more pedagogical details
on the construction of the process. We then show that the equilibrium time also satisfies
(4.6). Part VI contains the proof of some static estimates on the R.E.M. that we needed
in the previous parts.

II. Generalized Poincaré inequalities

Let $X = (X_t)_{t \ge 0}$ be a homogeneous Markov process on a finite state space, $\mathcal{X}$. We
assume that there is a unique invariant, ergodic probability measure for $X$, say $\pi$. We
further assume that $\pi$ charges every point in $\mathcal{X}$ and that it is reversible. Let $\eta$ be some
probability measure on $\mathcal{X}$ and call $\mathcal{L}_\eta(X_t)$ the law of $X_t$ when the initial law is $\eta$. We wish
to bound $d_{TV}(\mathcal{L}_\eta(X_t), \pi)$, the distance in total variation between the law of $X$ at time $t$
and the equilibrium law $\pi$. More precisely, we would like to obtain an upper bound in
terms of the geometry of the Markov process $X$, i.e. in terms of the geometry of the graph
structure induced by the transition matrix on the state space.

It is well known that one can use Poincaré inequalities to bound $d_{TV}(\mathcal{L}_\eta(X_t), \pi)$. Indeed,
calling $\lambda$ the spectral gap of the generator of $X$ (which is symmetric in $L^2(\pi)$ since we
have assumed that $\pi$ is reversible), we have, for any real valued function $f$ defined on $\mathcal{X}$
and for any $t > 0$,

$$\pi[(P_t f - \pi(f))^2] \le e^{-2\lambda t}\, \pi[f^2] \qquad (2.1)$$

where $P_t$ denotes the semi-group, i.e. $P_t f(x) = E_x[f(X_t)]$. From (2.1), it immediately
follows that

$$\max_{x \in \mathcal{X}} d_{TV}(\mathcal{L}_x(X_t), \pi) \le \sqrt{\frac{1}{\pi_*}}\, e^{-\lambda t} \qquad (2.2)$$

where $\mathcal{L}_x(X_t)$ is the law of $X_t$ when the initial law is a Dirac mass at the point $x \in \mathcal{X}$
and $\pi_* = \min_{x \in \mathcal{X}} \pi(x)$. It now remains to estimate $\lambda$ in terms of the geometry of $X$. Such
bounds exist; they rely on Poincaré inequalities: assume that for some constant $a > 0$ and
any function $f$ with $\pi(f) = 0$, we have

$$\pi(f^2) \le a\, \mathcal{E}(f, f) \qquad (2.3)$$

then $1/\lambda \le a$. Here $\mathcal{E}$ is the Dirichlet form of $X$. From (2.3) one can deduce lower bounds
on $\lambda$ in terms of optimization problems for paths on the weighted graph structure induced
by the transition matrix of $X$ on $\mathcal{X}$ (see [11] and the references therein). (2.2) might be
sharp or not, depending on $X$. Many efforts were recently made to improve (2.2). More
precise bounds can be obtained by replacing the Poincaré inequality with more sophisticated
functional inequalities such as Log-Sobolev, Sobolev or Nash inequalities. We refer to [11]
for a detailed discussion of this topic. In all cases, one estimates $\max_{x \in \mathcal{X}} d_{TV}(\mathcal{L}_x(X_t), \pi)$,
i.e. the speed of convergence to equilibrium starting from the worst initial point.
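As a sanity check of a bound of the type (2.2), one can look at the simplest reversible chain, for which everything is explicit. The sketch below is an illustration only: it assumes a two-state continuous-time chain with hypothetical jump rates `a` (from 0 to 1) and `b` (from 1 to 0), whose spectral gap is $a + b$, and it takes (2.2) to read $\sqrt{1/\pi_*}\, e^{-\lambda t}$.

```python
import math


def worst_case_tv(a, b, t):
    """Largest total variation distance to equilibrium at time t over the two
    possible starting points. Started at 0, the chain is at 1 with probability
    pi(1) (1 - exp(-(a + b) t)), so the distance is pi(1) exp(-(a + b) t);
    started at 1 it is pi(0) exp(-(a + b) t)."""
    return max(a, b) / (a + b) * math.exp(-(a + b) * t)


def gap_bound(a, b, t):
    """Right-hand side sqrt(1 / pi_*) exp(-lambda t) with lambda = a + b."""
    pi_star = min(a, b) / (a + b)
    return math.sqrt(1.0 / pi_star) * math.exp(-(a + b) * t)
```

Here both sides decay at the same exponential rate and only the prefactor differs, which is the situation where the uniform bound is essentially sharp.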
We look for estimates of $d_{TV}(\mathcal{L}_\eta(X_t), \pi)$ that should depend on $\eta$. This paper is an attempt
to adapt the strategy of the Poincaré inequality to this context: for each initial law $\eta$, we
introduce a family of functional inequalities, quite similar to the Poincaré one, and prove
that they allow one to control the distance to equilibrium. We call these inequalities
generalized Poincaré inequalities. We then derive geometric bounds for the constants
involved in these inequalities in the spirit of [11].

Let $(K(x,y), (x,y) \in \mathcal{X}\times\mathcal{X})$ be the transition matrix of the Markov process $X$. Since we
assume that the measure $\pi$ is reversible, the kernel $k(x,y) = K(x,y)/\pi(y)$ is symmetric,
i.e. $k(x,y) = k(y,x)$. Let $P_t f(x) = E_x[f(X_t)]$ denote the semi-group associated to $X$.
For functions $f$ and $g$ defined on $\mathcal{X}$, let

$$\mathcal{E}(f, g) = \frac{1}{2} \sum_{x,y} (f(x) - f(y))(g(x) - g(y))\, k(x,y)\, \pi(x)\pi(y) \qquad (2.4)$$

be the Dirichlet form of $X$. For any edge $e = (x,y) \in \mathcal{X}\times\mathcal{X}$, let $Q(e) = k(x,y)\pi(x)\pi(y)$.
Also define $d_e f = f(x) - f(y)$. Then (2.4) can be re-written as

$$\mathcal{E}(f, g) = \frac{1}{2} \sum_{e} Q(e)\, d_e f\, d_e g \qquad (2.5)$$
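The two ways of writing the Dirichlet form, as a sum over pairs of points and as a sum over edges, can be compared on a toy example. The following is a minimal sketch on a hypothetical 3-state chain; the measure `pi` and the symmetric edge weights `Q` are made up for illustration, with `Q[(x, y)]` standing for $k(x,y)\pi(x)\pi(y)$.

```python
from itertools import product

# Hypothetical reversible chain on {0, 1, 2}: Q[(x, y)] = k(x, y) pi(x) pi(y)
# is symmetric in (x, y); pairs that are not listed carry weight 0.
pi = [0.5, 0.3, 0.2]
Q = {(0, 1): 0.10, (1, 0): 0.10, (1, 2): 0.05, (2, 1): 0.05}


def dirichlet_over_pairs(f, g):
    """Half the sum over all ordered pairs (x, y) of
    (f(x) - f(y)) (g(x) - g(y)) k(x, y) pi(x) pi(y)."""
    n = len(pi)
    return 0.5 * sum(
        (f[x] - f[y]) * (g[x] - g[y]) * Q.get((x, y), 0.0)
        for x, y in product(range(n), repeat=2)
    )


def dirichlet_over_edges(f, g):
    """Half the sum over edges e = (x, y) of Q(e) d_e f d_e g,
    with d_e f = f(x) - f(y)."""
    return 0.5 * sum(
        q * (f[x] - f[y]) * (g[x] - g[y]) for (x, y), q in Q.items()
    )
```

Both functions return the same number; the edge form is the one that the geometric estimates manipulate, since it isolates the weight $Q(e)$ carried by each edge.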






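For $p \in ]0, 1]$, let us define the following constants:

$$\mathcal{L}(p) = \inf_{f;\, \pi(f) = 0} \frac{\mathcal{E}(f,f)\, \|f\|_\infty^{2/p - 2}}{\pi(|f|)^{2/p}} \qquad (2.6)$$

and, for a probability measure on $\mathcal{X}$, say $\eta$,

$$\mathcal{L}_\eta(p) = \inf_{f;\, \pi(f) = 0} \frac{\mathcal{E}(f,f)\, \|f\|_\infty^{2/p - 2}}{\eta(|f|)^{2/p}} \qquad (2.7)$$

Clearly $\mathcal{L}(p) = \mathcal{L}_\pi(p)$. Hölder's inequality implies that the function $p \to \mathcal{L}_\eta(p)$ is decreasing
and that $\mathcal{L}(p) \ge \lambda$ for any $p$. (Remember that $\lambda$ denotes the spectral gap of the generator
of $X$.)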

To measure the time it takes for the process to reach equilibrium, we define the following
quantities:

$$d_\eta(t) = \sup_{s \ge t}\ \sup_{f;\, \|f\|_\infty \le 1} \eta(|P_s f - \pi(f)|)$$

and, for any $\epsilon > 0$,

$$T_\eta(\epsilon) = \inf\{t > 0 \text{ s.t. } d_\eta(t) \le \epsilon\} \qquad (2.8)$$

Note that $d_{TV}(\mathcal{L}_\eta(X_s), \pi) \le d_\eta(t)$ for all $s \ge t$.
Remark: let

$$\Lambda(p) = \inf_{f;\, \pi(f) = 0} \frac{\mathcal{E}(f,f)}{\pi(|f|^p)^{2/p}}$$

Then, as a consequence of Hölder's inequality, $\Lambda(p) \le \mathcal{L}(p)$ for $p \in ]0, 1]$. Also $\Lambda(1) = \mathcal{L}(1)$.
The constants $\Lambda(p)$ and $\mathcal{L}(p)$ have already been introduced in [5]. (In the notation of [5],
$\mathcal{L}(p)$ is denoted $\mathcal{K}(p/(1-p), +\infty)$.) It follows from the results of [5] that $\mathcal{L}(p)$ can also
be defined in terms of the capacity associated to $\mathcal{E}$, and different estimates of hitting times
can be derived in terms of $\mathcal{L}(p)$*. We also have $\Lambda(2) = \lambda$ and $\Lambda(p) \ge \lambda$ for any $p \in ]0, 1]$.
Because of the similarity between the definition of $\Lambda(p)$ and the Poincaré inequality, we call the
inequality $\Lambda(p) \ge a$, for some $a > 0$, a "generalized Poincaré inequality", although there is
no spectral interpretation.

Theorem 2.1 : let $p \in ]0, 1]$ and $p' \in ]0, 1]$. There exists a universal function of $(p,p')$,
$C_{p,p'}$, such that, for any probability measure $\eta$ and any $t > 0$,

$$d_\eta(t) \le C_{p,p'}\, \mathcal{L}_\eta(p')^{-p'/2}\, \mathcal{L}(p)^{-pp'/(4-2p)}\, t^{-p'/(2-p)} \qquad (2.9)$$

$C_{p,p'} = e^{-p'/2} (p/(2-p))^{pp'/(4-2p)}$ would do. As a consequence, for any $\epsilon > 0$, we have

$$T_\eta(\epsilon) \le C_p\, \mathcal{L}_\eta(p')^{-(2-p)/2}\, \mathcal{L}(p)^{-p/2}\, \epsilon^{-(2-p)/p'} \qquad (2.10)$$

where $C_p = e^{-(2-p)/2} (p/(2-p))^{p/2}$.

* We take this opportunity to warn the reader that the results of part II in [5] are false.

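Proof: we shall prove that, for any function $f$ with $\pi(f) = 0$,

$$\eta(|P_t f|) \le C_{p,p'}\, \mathcal{L}_\eta(p')^{-p'/2}\, \mathcal{L}(p)^{-pp'/(4-2p)}\, t^{-p'/(2-p)}\, \|f\|_\infty \qquad (2.11)$$

with $C_{p,p'} = e^{-p'/2} (p/(2-p))^{pp'/(4-2p)}$. (2.11) implies (2.9).

Step 1: define

$$\mathcal{K}(p) = \inf_{f;\, \pi(f) = 0} \frac{\mathcal{E}(f,f)\, \|f\|_\infty^{(4-2p)/p}}{\pi(f^2)^{2/p}} \qquad (2.12)$$

From Hölder's inequality, we deduce that $\mathcal{K}(p) \ge \mathcal{L}(p)$. Let $f$ be s.t. $\pi(f) = 0$. We claim
that

$$\pi[(P_t f)^2] \le \left(\frac{4-2p}{p}\right)^{-p/(2-p)} (\mathcal{K}(p)\, t)^{-p/(2-p)}\, \|f\|_\infty^2 \qquad (2.13)$$

Then (2.13) will also hold with $\mathcal{K}(p)$ replaced by $\mathcal{L}(p)$.

Proof of (2.13): let $\varphi(t) = \pi[(P_t f)^2]$. Then $\varphi'(t) = -2\mathcal{E}(P_t f, P_t f)$. By definition of $\mathcal{K}(p)$,
we have:

$$\varphi'(t) \le -2\mathcal{K}(p)\, \frac{\varphi(t)^{2/p}}{\|P_t f\|_\infty^{(4-2p)/p}}$$

Since $P_t$ is a contraction in $L^\infty$, we also have

$$\varphi'(t) \le -2\mathcal{K}(p)\, \frac{\varphi(t)^{2/p}}{\|f\|_\infty^{(4-2p)/p}}$$

Integrating this last inequality, we get

$$\varphi(t)^{1-2/p} \ge \varphi(0)^{1-2/p} + \frac{4-2p}{p}\, t\, \frac{\mathcal{K}(p)}{\|f\|_\infty^{(4-2p)/p}}$$

which implies (2.13). ■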

Step 2: there exists a universal constant $C$ s.t. for any function $f$ and any $t > 0$ we have

$$\mathcal{E}(P_t f, P_t f) \le (C/t)\, \pi[f^2] \qquad (2.14)$$

($C = 1/(2e)$ would do.)

Proof: for all $\mu > 0$ and $t > 0$, we have $\mu e^{-2\mu t} \le C/t$. Use this inequality and a spectral
decomposition of the Dirichlet form $\mathcal{E}$ to deduce (2.14). ■
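The elementary inequality behind Step 2 is easy to test numerically. The sketch below assumes the inequality reads $\mu e^{-2\mu t} \le 1/(2et)$, which follows from maximizing the left-hand side over $\mu$ (the maximum is attained at $\mu = 1/(2t)$); the function names are illustrative.

```python
import math


def spectral_weight(mu, t):
    """Left-hand side mu * exp(-2 mu t): the factor multiplying each spectral
    component of pi[f^2] when the Dirichlet form of P_t f is expanded."""
    return mu * math.exp(-2.0 * mu * t)


def step2_bound(t):
    """Right-hand side C / t with C = 1 / (2 e)."""
    return 1.0 / (2.0 * math.e * t)
```

Since the bound is uniform in $\mu$, it applies to every eigenvalue of the generator at once, which is why no information on the spectrum is needed in (2.14).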




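Using (2.14), the semi-group property $P_t = P_{t/2} P_{t/2}$, and the fact that $P_t$ is a contraction
in $L^\infty$, we get that

$$\eta(|P_t f|)^{2/p'} \le \frac{2C}{t\, \mathcal{L}_\eta(p')}\, \pi(|P_{t/2} f|^2)\, \|f\|_\infty^{(2-2p')/p'}$$

Using (2.13) (with $\mathcal{L}(p)$ instead of $\mathcal{K}(p)$), we get that

$$\eta(|P_t f|)^{2/p'} \le \frac{2C}{t\, \mathcal{L}_\eta(p')} \left(\frac{4-2p}{p}\right)^{-p/(2-p)} \left(\frac{\mathcal{L}(p)\, t}{2}\right)^{-p/(2-p)} \|f\|_\infty^{2/p'}$$

which, with $C = 1/(2e)$, is (2.11) raised to the power $2/p'$.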



Remarks: 



(i) Depending on the concrete example under consideration, the sharpness of the bound
(2.9) ranges from good to extremely bad. Let us just outline one example where Theorem
2.1 leads to a very bad estimate: we consider the usual random walk on the discrete cube
$\mathcal{X} = \{-1,+1\}^N$. Then $Q(e) = 1/(N 2^N)$ for any edge between two nearest neighbours
in $\mathcal{X}$. Choose for $\eta$ a Dirac mass, say $\eta = \delta_a$. Using the test function $f = \delta_a - \pi(a)$ in
formula (2.7), we get that, for large enough $N$:



Therefore (2.10) would lead to the conclusion that the process reaches equilibrium in a time
shorter than $\exp(cN)$, whereas the true value of $T_\eta(\epsilon)$ is known to be of order $N \log N$.
We will see with the R.E.M. an example where Theorem 2.1 leads to more interesting
conclusions.

There is one situation in which (2.9) is not so far from being sharp: assume that $\eta = \pi$.
Let $a$ be such that, for any function $f$ with $\pi(f) = 0$, and for any time $t > 0$, we have

$$\pi(|P_t f|) \le \left(\frac{a}{t}\right)^{p/(2-p)} \|f\|_\infty \qquad (2.15)$$

By interpolation, (2.15) implies that

$$\pi[(P_t f)^2] \le \left(\frac{a}{t}\right)^{p/(2-p)} \|f\|_\infty^2 \qquad (2.16)$$

Use now the inequality

$$\pi[f^2] - \pi[(P_t f)^2] = 2\int_0^t \mathcal{E}(P_s f, P_s f)\, ds \le 2t\, \mathcal{E}(f, f)$$

to get that

$$\pi(f^2) \le 2t\, \mathcal{E}(f, f) + \left(\frac{a}{t}\right)^{p/(2-p)} \|f\|_\infty^2$$

Choosing the best value for $t$, we obtain the inequality

$$\pi(f^2) \le C_p\, a^{p/2}\, \|f\|_\infty^{2-p}\, (\mathcal{E}(f,f))^{p/2}$$

where $C_p$ is some universal function of $p$. In other words we have proved that $1/\mathcal{K}(p) \le
C_p a$, i.e. (2.13) is sharp, up to multiplicative constants.

(ii) We derive estimates of the eigenvectors of $\mathcal{E}$ in terms of $\mathcal{L}(p)$. Following the terminology
of [5], let us define

K2(p) = , ,,%>= (2 ' 17) 

It follows from Proposition 1, Proposition 2 and Theorem 1 in [5] that, for any $p' < p$,
there exists a constant $C_{p,p'}$ such that $\mathcal{K}(p) \le C_{p,p'}\, \mathcal{K}_2(p')$.

Let now $l$ be an eigenvalue of $\mathcal{E}$ and $\phi$ be the corresponding eigenvector. We assume that
$l \ne 0$ ($\phi$ is not constant), and $\pi[\phi^2] = 1$. Using $f = \phi$ in (2.17) and $\mathcal{E}(\phi, \phi) = l$, we obtain
an upper bound on $\mathcal{K}_2(p)$ in terms of $l$ and $\pi(|\phi|)$. Replacing $\mathcal{K}_2(p)$ by $\mathcal{L}(p)$, we therefore have:

4101) ^ C ^'^) P ' K1+P,) ( 2 - 18 ) 

for any p' < p. 

(2.18) implies that, if $l$ is much smaller than $\mathcal{L}(p)$, then $\pi(|\phi|)$ is small, i.e. the function $\phi$
is very concentrated on its support. Since $l \ge \lambda$, where $\lambda$ is the spectral gap, this situation
can occur only if, for some $p$, $\lambda \ll \mathcal{L}(p)$. This will be the case for the Metropolis dynamics of
the R.E.M. at high temperature and we shall use (2.18) to prove that the first eigenvector
of the dynamics is degenerate.

Geometric estimates: a path $\gamma$ in $\mathcal{X}$ is a sequence of vertices $\gamma = (x_0, \dots, x_k)$. Equivalently,
$\gamma$ can be viewed as a sequence of bonds $\gamma = (e_1, \dots, e_k)$ with $e_i = (x_{i-1}, x_i)$. The
length of $\gamma$ is $|\gamma| = k$. For $x, y \in \mathcal{X}$, let $\Gamma(x, y)$ be the set of all paths $\gamma = (x_0, \dots, x_k)$ with
$x_0 = x$ and $x_k = y$ and $k(x_{i-1}, x_i) \ne 0$ for all $i = 1, \dots, k$. For each $x \ne y \in \mathcal{X}$, let us choose
one path, say $\gamma(x,y) \in \Gamma(x,y)$. Since we have assumed that $\pi$ is ergodic and charges all
points in $\mathcal{X}$, $X$ is irreducible and therefore $\Gamma(x,y)$ is always non empty.

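Theorem 2.2 : (i) Let $p \in ]0, 1[$. Let $\lambda(x)$ and $\mu(x)$ be two positive functions on $\mathcal{X}$.
We have

$$\frac{1}{\mathcal{L}_\eta(p)} \le 2^{2/p-1} \Big(\sum_x \pi(x)\lambda(x)^{p/(1-p)}\Big)^{2(1-p)/p} \Big(\sum_y \eta(y)\mu(y)^{p/(1-p)}\Big)^{2(1-p)/p} \sum_{e \text{ s.t. } Q(e) \ne 0} \frac{1}{Q(e)} \Big(\sum_{x,y \text{ s.t. } e \in \gamma(x,y)} \frac{\pi(x)\eta(y)}{\lambda(x)\mu(y)}\Big)^2 \qquad (2.19)$$

(ii)

$$\frac{1}{\mathcal{L}_\eta(1)} \le 2 \sum_{e \text{ s.t. } Q(e) \ne 0} \frac{1}{Q(e)} \Big(\sum_{x,y \text{ s.t. } e \in \gamma(x,y)} \pi(x)\eta(y)\Big)^2 \qquad (2.20)$$

Comments: let us recall from [11] the following estimate of the spectral gap:

$$\frac{1}{\lambda} \le \max_{e \text{ s.t. } Q(e) \ne 0} \frac{1}{Q(e)} \sum_{x,y \text{ s.t. } e \in \gamma(x,y)} |\gamma(x,y)|\, \pi(x)\pi(y) \qquad (2.21)$$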
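To see how a path estimate of this type behaves on a concrete graph, one can evaluate it on a small discrete cube. The following is a minimal sketch, assuming the classical estimate from [11] in the form $1/\lambda \le \max_e Q(e)^{-1} \sum_{x,y:\, e \in \gamma(x,y)} |\gamma(x,y)|\,\pi(x)\pi(y)$, the walk with $Q(e) = 1/(N2^N)$, and "bit-fixing" paths that correct the disagreeing coordinates from left to right. The choice of paths and all function names are illustrative assumptions, not the construction used later for the R.E.M.

```python
from fractions import Fraction
from itertools import product


def bit_fixing_path(x, y):
    """Edges of the path from x to y that corrects disagreeing coordinates
    one at a time, from the first coordinate to the last."""
    edges, current = [], list(x)
    for i in range(len(x)):
        if current[i] != y[i]:
            following = current.copy()
            following[i] = y[i]
            edges.append((tuple(current), tuple(following)))
            current = following
    return edges


def path_bound(n):
    """max over edges e of (1 / Q(e)) * sum over pairs (x, y) whose path uses e
    of |gamma(x, y)| pi(x) pi(y), for the uniform measure on {0, 1}^n."""
    q = Fraction(1, n * 2 ** n)
    weight = Fraction(1, 4 ** n)  # pi(x) pi(y)
    load = {}
    points = list(product((0, 1), repeat=n))
    for x in points:
        for y in points:
            path = bit_fixing_path(x, y)
            for e in path:
                load[e] = load.get(e, Fraction(0)) + len(path) * weight
    return max(load.values()) / q
```

For this choice of paths the bound evaluates to $N(N+1)/4$, while the inverse spectral gap of this walk is $N/2$; the path estimate is therefore off by a factor of order $N$ here, which is typical of such geometric bounds.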

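Proof: (ii) follows from (i): choose $\lambda(x) = \mu(x) = 1$, and let $p$ tend to 1.

Let $f$ be a function s.t. $\pi(f) = 0$. Note that $f(x) - f(y) = \sum_{e \in \gamma(x,y)} d_e f$. Therefore

$$\begin{aligned}
\sum_y \eta(y)|f(y)| &= \sum_y \eta(y)\Big|f(y) - \sum_x \pi(x) f(x)\Big| \\
&\le \sum_{x,y} \eta(y)\pi(x)\, |f(x) - f(y)| \\
&= \sum_{x,y} \eta(y)\pi(x)\, \Big|\sum_{e \in \gamma(x,y)} d_e f\Big| \\
&\le 2^{1-p} \|f\|_\infty^{1-p} \sum_{x,y} \eta(y)\pi(x)\, \Big|\sum_{e \in \gamma(x,y)} d_e f\Big|^p \\
&= 2^{1-p} \|f\|_\infty^{1-p} \sum_{x,y} \eta(y)\pi(x) \Big[\frac{1}{\lambda(x)\mu(y)} \Big|\sum_{e \in \gamma(x,y)} d_e f\Big|\Big]^p \lambda(x)^p \mu(y)^p
\end{aligned}$$

We apply Hölder's inequality to get

$$\begin{aligned}
\sum_y \eta(y)|f(y)| &\le 2^{1-p} \|f\|_\infty^{1-p} \Big(\sum_{x,y} \pi(x)\eta(y)\lambda(x)^{p/(1-p)}\mu(y)^{p/(1-p)}\Big)^{1-p} \Big(\sum_{x,y} \frac{\pi(x)\eta(y)}{\lambda(x)\mu(y)} \Big|\sum_{e \in \gamma(x,y)} d_e f\Big|\Big)^p \\
&\le 2^{1-p} \|f\|_\infty^{1-p} \Big(\sum_{x,y} \pi(x)\eta(y)\lambda(x)^{p/(1-p)}\mu(y)^{p/(1-p)}\Big)^{1-p} \times \Big(\sum_e |d_e f| \sum_{x,y \text{ s.t. } e \in \gamma(x,y)} \frac{\pi(x)\eta(y)}{\lambda(x)\mu(y)}\Big)^p
\end{aligned}$$

Applying once more Hölder's inequality, we get

$$\begin{aligned}
\sum_y \eta(y)|f(y)| \le{} & 2^{1-p} \|f\|_\infty^{1-p} \Big(\sum_{x,y} \pi(x)\eta(y)\lambda(x)^{p/(1-p)}\mu(y)^{p/(1-p)}\Big)^{1-p} \\
& \times \Big(\sum_e (d_e f)^2 Q(e)\Big)^{p/2} \Big(\sum_e \frac{1}{Q(e)} \Big(\sum_{x,y \text{ s.t. } e \in \gamma(x,y)} \frac{\pi(x)\eta(y)}{\lambda(x)\mu(y)}\Big)^2\Big)^{p/2}
\end{aligned}$$

Replacing $\sum_e Q(e)|d_e f|^2$ by $2\mathcal{E}(f, f)$, we get the desired result. ■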
III. Applications



This part of the paper mainly has a pedagogical aim. We shall illustrate how one can use 
the results of part II in a concrete situation. An even more concrete example of application 
will be given in the next part with the R.E.M. 

Comparing (2.20) and (2.21), one sees that the gain in using generalized spectral gap
inequalities instead of the usual spectral gap inequality is that we can now afford having
some "very bad sites", since we replaced a "max" over edges $e$ by a sum. Besides, formula
(2.19) gives us the possibility of 'killing' these bad points by choosing $\lambda$ and $\mu$. To illustrate
the way it works, let us assume that the state space $\mathcal{X}$ can be divided into two disjoint
sets, $B$ and $G$. '$B$' stands for 'bad'. Points in $B$ are supposed to be pathological and we
do not expect them to play any role in the speed of convergence when the initial measure
is smooth enough.

The next Theorem states a lower bound for $\mathcal{L}_\eta(p)$ which is valid for any partition of $\mathcal{X}$
into two sets $B$ and $G$, but (3.1) is useful only if, firstly, the measure of $B$
is small both for $\pi$ and $\eta$ and, secondly, the hitting time of $B$
is large, i.e. the weights $Q(e)$ for those edges $e$ that touch $B$ are not too small.
Let us introduce some notation: 

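$$\gamma^* = \sup_{x,y \in \mathcal{X}} |\gamma(x,y)|$$

$$\mathcal{B} = \{e \in \mathcal{X}\times\mathcal{X} \text{ s.t. there exist } x \text{ and } y \text{ s.t. } e \in \gamma(x, y) \text{ and } x \in B \text{ or } y \in B\}$$

In $\mathcal{B}$ are edges $e \in \gamma(x, y)$ with both $x$ and $y$ in $B$.

Theorem 3.1 : for any $p \in ]0, 1]$, for any probability measure $\eta$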

VU J e s -*- V^ e ^ U \ ^ V > xE G,yeG s.t. ee~((x,y) 



^£^)(^) 2/p +^) 2/p )> 



(3.1) 



Proof: let $p \in ]0, 1[$. The proof for $p = 1$ is simpler and we leave it to the reader. Let us
choose $\lambda$ and $\mu$ as follows: $\lambda(x) = \mu(x) = 1$ for $x \in G$, $\lambda(x) = \pi(B)^{1-1/p}$ for $x \in B$ and
$\mu(x) = \eta(B)^{1-1/p}$ for $x \in B$. Then

$$\sum_x \pi(x)\lambda(x)^{p/(1-p)} = \pi(G) + 1 \le 2$$

The same holds for $\sum_y \eta(y)\mu(y)^{p/(1-p)}$. Therefore

$$\frac{1}{\mathcal{L}_\eta(p)} \le 2^{6/p-5} \sum_e \frac{1}{Q(e)} \Big(\sum_{x,y \text{ s.t. } e \in \gamma(x,y)} \frac{\pi(x)\eta(y)}{\lambda(x)\mu(y)}\Big)^2 \qquad (3.2)$$

We compute the sum in (3.2) considering separately the cases $(x, y) \in G\times G$, $(x, y) \in B\times B$,
$(x, y) \in G\times B$ and $(x, y) \in B\times G$. Since $\lambda = \mu = 1$ on $G$, the first term is bounded by

e ^ x,i;6G s.t. e£7(a;,i/) 

\ x,yEG s.t. eEj(x,y) / \ e X ,V^ X e67(x,y) 

^ ' x,y£G s.t. e6 7 (x,y) / \x,y 

\ ^ W x,y£G s.t. e6 7 (x,y) / 

The term corresponding to the case $(x,y) \in B\times B$ is bounded by

eeB ^ > x,yEB 



> 2/p + v(B) 2/p ) 

The term corresponding to the case (x,y) G GxB is bounded by 
(3.3)



^(E^))^( 5 ) 2/P (3-4) 



(3.5) 



Similarly the contribution of $(x,y) \in B\times G$ is bounded by



Inserting these bounds in (3.2) leads to the statement of Theorem 3.1. ■ 



In the preceding Theorem, we chose the same 'bad' set for both measures $\pi$ and $\eta$. We now
describe a slightly more sophisticated version of Theorem 3.1 obtained when choosing a
different bad set for $\pi$ and $\eta$. Let us therefore assume that $\mathcal{X}$ can be split into the disjoint
union of two sets $B_\eta$ and $G_\eta$. $B_\eta$ might differ from $B$. We modify the definition of $\mathcal{B}$
accordingly:

$$\mathcal{B} = \{e \in \mathcal{X}\times\mathcal{X} \text{ s.t. there exist } x \text{ and } y \text{ s.t. } e \in \gamma(x, y) \text{ and } x \in B \text{ or } y \in B_\eta\}$$

The proof of the following claim is identical to the proof of Theorem 3.1:

Theorem 3.2 : for any $p \in ]0, 1]$, for any probability measure $\eta$, any partitions $\mathcal{X} =
B \cup G = B_\eta \cup G_\eta$, we have

i < 2 ^- 3 { 7 * S up (-]- < x )<y) 

5 m) (^) 2/p ^w 2/p )> 



(3.7) 




Proof: choose $\lambda(x) = 1$ for $x \in G$, $\mu(x) = 1$ for $x \in G_\eta$ and $\lambda(x) = \pi(B)^{1-1/p}$ for
$x \in B$, $\mu(x) = \eta(B_\eta)^{1-1/p}$ for $x \in B_\eta$. Then proceed as in the proof of Theorem 3.1. ■

Finally let us mention that even more elaborate bounds can be obtained: we could
distinguish bonds in $\mathcal{B}$ linking sites $(x,y)$ with $(x, y) \in G\times B_\eta$, $(x,y) \in B\times G_\eta$ and $(x,y) \in
B\times B_\eta$. We could also introduce 'weights' on bonds. We could choose a 'flow' of paths
rather than picking a single path from $x$ to $y$. If necessary, one can also use these three
tricks at the same time. We refer to Chapter 3 in [11] for the notions of 'weights' and 'flow'
or even 'generalized weights'.



IV. Dynamical phase transition for the REM 

Before stating our result, let us recall the definition and some known facts on the R.E.M.
Derrida's Random Energy Model: The REM was introduced by Derrida [1,2] as the
simplest mean field spin glass. It is a caricature of the Sherrington & Kirkpatrick (SK)
spin glass model [10]. Both are spin systems with Ising spins taking value ±1. In the SK
model one has Gaussian pair interactions, while in the REM one has Gaussian multibody
interactions of any order. The Hamiltonian of the REM is

$$H(\sigma) = \sqrt{N}\, 2^{-N/2} \sum_{\alpha \subset \{1,\dots,N\}} J_\alpha \sigma_\alpha \qquad (4.1)$$

where the sum is over all the $2^N$ subsets of $\{1, \dots, N\}$, $(J_\alpha, \alpha \subset \{1, \dots, N\})$ is a family
of i.i.d. standard Gaussian variables defined on a common probability space $(\Omega, \Sigma, Q)$
and $\sigma_\alpha = \prod_{i \in \alpha} \sigma_i$ with the convention that $\sigma_\emptyset = 1$. It turns out that the random variables
$H(\sigma)$ and $H(\sigma')$ corresponding to different configurations $\sigma \ne \sigma'$ are independent Gaussian
variables with zero mean and variance $N$. The equilibrium statistical mechanics of the
REM has been well studied, e.g., in a non rigorous way, in [1,2] and, in a rigorous way, in
[3,8,9]. We quote some of the (rigorous) results that will be important for understanding
the dynamics. Given $\beta > 0$, the inverse temperature, let us denote by

$$Z_N = Z_N(\beta) = \sum_\sigma e^{-\beta H(\sigma)} \qquad (4.2)$$

the finite volume partition function and by

$$F_N(\beta) = \frac{1}{N} \log Z_N(\beta) \qquad (4.3)$$

the finite volume free energy.

It was proved in [8] that for all $\beta > 0$ the limit $\lim_{N\to\infty} F_N(\beta) = F(\beta)$ exists $Q$-almost
surely and in $L^p(\Omega, \Sigma, Q)$ for $1 \le p < \infty$. $F(\beta)$ equals $\beta^2/2 + \beta_c^2/2$ for $\beta < \beta_c$ and $\beta_c \beta$
for $\beta > \beta_c$, as expected from the results of [1]. $F(\beta)$ is therefore a non random function
which is twice differentiable in $\beta$ but the second derivative has a jump at $\beta_c = \sqrt{2 \log 2}$.
This is called in the physics literature a third order phase transition. Another important
fact is that, depending on whether we are in a high temperature regime ($\beta < \beta_c$) or in a low
temperature one ($\beta > \beta_c$), not only does the free energy change from a quadratic function
of $\beta$ to a linear one, but the difference between the finite volume free energy and its infinite
volume limit is exponentially small in $N$ in the high temperature case, whereas, in the low
temperature regime, $F_N(\beta) - F(\beta)$ behaves as $C(\omega, \beta, N) \frac{\log N}{N}$, for some random function
$C(\omega,\beta,N)$. $C(\omega,\beta,N)$ converges in $Q$-probability to a non-random limit but does not
converge $Q$-almost surely, and the $Q$-almost-sure cluster set of $C(\omega, \beta, N)$ was identified in
[9].
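The two regimes of the free energy are simple to write down explicitly. The following is a minimal sketch, assuming the normalization $F_N(\beta) = \frac{1}{N}\log Z_N(\beta)$ and the limiting formula quoted from [8], with $\beta_c = \sqrt{2\log 2}$; the function names are illustrative.

```python
import math

BETA_C = math.sqrt(2.0 * math.log(2.0))  # critical inverse temperature


def limiting_free_energy(beta):
    """F(beta): quadratic in beta below beta_c, linear in beta above."""
    if beta < BETA_C:
        return beta ** 2 / 2.0 + BETA_C ** 2 / 2.0
    return beta * BETA_C


def finite_volume_free_energy(energies, beta, n_spins):
    """F_N(beta) = (1/N) log sum_sigma exp(-beta H(sigma)), computed with a
    log-sum-exp shift so that large |beta H| does not overflow."""
    exponents = [-beta * h for h in energies]
    top = max(exponents)
    log_z = top + math.log(sum(math.exp(v - top) for v in exponents))
    return log_z / n_spins
```

The two branches agree at $\beta_c$, where both equal $\beta_c^2$, and so do their first derivatives; the change of regime only shows up in the second derivative.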

Let us now discuss the dynamical properties of the model. We consider the Metropolis
dynamics (see (4.8)). A first step in the study of the dynamics for the REM was done in
[4]. There the spectral gap, $\lambda_N$, of the usual single spin flip Metropolis dynamics in volume
$N$ is studied. In particular it was proved that for all inverse temperatures $\beta > 0$ we have,
$Q$-almost surely,

$$\lim_{N\to\infty} -\frac{1}{N} \log \lambda_N = \beta\beta_c \qquad (4.4)$$

Moreover $Q$-almost sure finite size corrections are also given in [4]: we have

$$\beta\beta_c - c\beta\sqrt{\frac{\log N}{N}} \le -\frac{1}{N} \log \lambda_N \le \beta\beta_c + c\beta\sqrt{\frac{\log N}{N}} \qquad (4.5)$$

$Q$-almost-surely, for all but a finite number of indices $N$, for some constant $c$.
However one would have expected the dynamics to present a kind of transition, like the
previously mentioned static phase transition that can be seen on the free energy $F(\beta)$.
Such a dynamical transition is not seen on the spectral gap.

Thus we are led to the following question: how can we see a dynamical phase transition
on the single spin flip dynamics?

The inverse spectral gap can be used as an estimate for the thermalization time of the
dynamics. For the Metropolis dynamics, $1/\lambda_N$ is actually a sharp upper bound for the
time it takes for the dynamics to reach equilibrium, whatever the initial law. In
particular we may consider the dynamics issued from a given configuration. The REM is
rather pathological in the sense that the configurations of lowest energy (of order $-\beta_c N$)
are surrounded (in the sense of a single spin flip) by configurations of energy of order at
most $\pm\sqrt{N \log N}$. The bounds in (4.5) follow from this fact. Starting the dynamics at a
configuration of lowest energy, we have to wait for a time of order $e^{N\beta\beta_c}$ before the first
spin flip. As we see, the time to reach equilibrium starting from a configuration of minimal
energy is therefore of order $e^{N\beta\beta_c}$.

In the low temperature regime, $\beta > \beta_c$, the equilibrium measure is concentrated on these
configurations of minimal energy. But in the high temperature regime, $\beta < \beta_c$, the invariant
measure does not charge these configurations with minimal energy too much. In fact
the invariant measure has its mass concentrated on configurations with energy of order
$-\beta N$. This follows from results in [8]. In a certain sense, when $\beta < \beta_c$, it is therefore
'unnatural' to compute the thermalization time starting from a configuration of minimal
energy.

We shall therefore change our point of view: instead of considering any initial law, we
shall rather estimate the time to equilibrium when the dynamics starts from the uniform
probability. Doing this we expect the dynamics to avoid the configurations of minimal
energy (in the high temperature regime), and thus we hope to see a dynamical phase
transition.
Using generalized Poincaré inequalities, we get upper bounds for the time to equilibrium
starting from the uniform law, say $T_N$. We prove that, when $\beta < \beta_c$,

$$\limsup_{N\to\infty} \frac{1}{N} \log T_N \le \beta^2 \qquad (4.6)$$

Comparing (4.6) with (4.5), one sees that the thermalization time is much shorter than the
inverse spectral gap. In other words, in the high temperature regime, starting from the
uniform law, the dynamics reaches equilibrium much faster than starting from one of the
configurations of minimal energy. These results can be interpreted as a first step towards
a proof of the existence of a dynamical phase transition. Actually we expect (4.6) to be
sharp, i.e. we expect $\frac{1}{N} \log T_N$ to converge to $\beta^2$ for all $\beta < \beta_c$. In the low temperature
regime, the asymptotics of $T_N$ should be given by the inverse spectral gap, i.e. one expects
$\frac{1}{N} \log T_N$ to converge to $\beta\beta_c$ for all $\beta > \beta_c$. Thus one would see the dynamical phase
transition for the Metropolis dynamics.

Remember that the Hamiltonians $H(\sigma)$, $\sigma \in \{-1, +1\}^N$, of the REM form a family of i.i.d.
Gaussian random variables with mean zero and variance $N$, defined on some probability
space, say $(\Omega, \Sigma, Q)$. Given $\beta > 0$, the inverse temperature, the Gibbs measure is defined
by

$$\pi_\beta(\sigma) = \frac{e^{-\beta H(\sigma)}}{Z_N} \qquad (4.7)$$

where $Z_N$ is defined in (4.2). For a given realization of the Hamiltonian, we consider the
Metropolis dynamics, $X(t) = X_N(t)$: $X(t)$ is the continuous time Markov process defined
on $\mathcal{X} = \{-1, +1\}^N$ by the transition rates:

$$P(\sigma, \sigma') = \frac{1}{N} \exp\{-\beta (H(\sigma') - H(\sigma))^+\} \quad \text{if } \|\sigma - \sigma'\| = 1 \qquad (4.8)$$

where $a^+ = \max\{a, 0\}$ and $\|x\| = \frac{1}{2} \sum_{i=1}^N |x_i|$. $\pi_\beta$ is invariant, ergodic, and reversible for
this dynamics.
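Reversibility of the Gibbs measure for these rates can be verified directly on a small system. The sketch below is an illustration under stated assumptions: energies are sampled as i.i.d. centred Gaussians of variance $N$, the rate between two configurations differing by one spin flip is taken to be $\frac{1}{N}\exp\{-\beta(H(\sigma') - H(\sigma))^+\}$ and zero otherwise, and the function names are hypothetical.

```python
import math
import random
from itertools import product


def sample_rem_energies(n, seed=0):
    """One realization of the REM energies: i.i.d. N(0, n) variables indexed
    by the configurations of {-1, +1}^n."""
    rng = random.Random(seed)
    return {s: rng.gauss(0.0, math.sqrt(n)) for s in product((-1, 1), repeat=n)}


def gibbs_measure(energies, beta):
    """pi_beta(sigma) = exp(-beta H(sigma)) / Z_N, with the minimal energy
    subtracted before exponentiating for numerical stability."""
    lowest = min(energies.values())
    weights = {s: math.exp(-beta * (h - lowest)) for s, h in energies.items()}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}


def metropolis_rate(energies, beta, s, s2):
    """Assumed rate (1/N) exp(-beta (H(s2) - H(s))^+) when s and s2 differ by
    exactly one spin, and 0 otherwise."""
    if sum(a != b for a, b in zip(s, s2)) != 1:
        return 0.0
    return math.exp(-beta * max(energies[s2] - energies[s], 0.0)) / len(s)
```

Detailed balance holds because $\pi_\beta(\sigma) P(\sigma, \sigma')$ is proportional to $\exp\{-\beta \max(H(\sigma), H(\sigma'))\}$, which is symmetric in $(\sigma, \sigma')$.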

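The associated Dirichlet form on $L^2(\mathcal{X}, \pi_\beta)$ is given by

$$\mathcal{E}(f, g) = \frac{1}{2N Z_N(\beta)} \sum_{x,y;\, \|x-y\|=1} (f(x) - f(y))(g(x) - g(y))\, e^{-\beta (H(x) \vee H(y))} \qquad (4.9)$$

With the notation of part II,

$$Q(e) = \frac{e^{-\beta (H(x) \vee H(y))}}{N Z_N(\beta)}$$

for $e = (x, y)$ with $\|x - y\| = 1$.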

From (4.5), one deduces that for any fixed initial law $\eta$, and any $\gamma > \beta\beta_c$, $Q$-a.s.

$$d_{TV}(\mathcal{L}_\eta(X(e^{\gamma N})), \pi_\beta) \to 0 \qquad (4.10)$$

From now on, we assume that $\beta < \beta_c$. Given a probability measure $\eta$ on $\mathcal{X}$, and $t \in \mathbb{R}$,
let $\mathcal{L}_\eta(X(t))$ be the law of the process at time $t$ starting from the initial measure $\eta$.

Given $\epsilon > 0$, $c > 0$ and a probability measure $\eta$, we define the time $T_N(\epsilon, c, \eta)$ to reach
equilibrium starting from $\eta$, up to $\epsilon$, on a subset of $Q$-probability greater than $1 - e^{-cN}$ by

$$T_N(\epsilon, c, \eta) = \inf\Big\{T > 0 :\ Q\Big[\sup_{s \ge T} d_{TV}(\mathcal{L}_\eta(X(s)), \pi_\beta) \le \epsilon\Big] \ge 1 - e^{-cN}\Big\} \qquad (4.11)$$

The main result of this section is

Theorem 4.1 Let $\eta$ be the uniform probability measure on $\mathcal{X}$. Then for all $c > 0$,
$\epsilon > 0$ and for all $\beta < \beta_c$

$$\limsup_{N \uparrow \infty} \frac{1}{N} \log T_N(\epsilon, c, \eta) \le \beta^2 \qquad (4.12)$$


We can also prove estimates when $\epsilon$ goes to 0 as $N \to \infty$. We consider two cases: $\epsilon$ going
to 0 polynomially and as a stretched exponential.

Theorem 4.2 Let $\eta$ be the uniform probability measure on $\mathcal{X}$. There exists a constant
$c_1 > 0$, such that for all $c > 0$, there exists a constant $C_0 = C_0(c, \beta)$ such that



Uo g T N (e- Nl/ ^ 3/ \c^) 



1 AT\^ 1 AT (4 - 13) 

logiV\ log iV 3/4 

~1T) +Co( nv _) 



where 



c 2 (P, c)=0 (l2/5/5 cV /ci(l + c)) + - 
Moreover for all 5 > 



V2 1 02 + ^2 



/?/? cV /ci(l + c) 



1 logT^CJV-*, c, 77) < /3 2 + /3 (l2/3/W Cl (l + c)) V * (^) 1/4 



+ 2/3/3 c ('^ 1+ fj lQSiV V /2 + 1/32 + ^ ' 



AT 



4 /3/3 c ^/ Cl (l + C ) v iV 



logiV 1/2 + c7o _logJV 



AT 



(4.14) 



(4.15) 
As a corollary, we get

Corollary 4.3 Let $\eta$ be the uniform probability measure on $\mathcal{X}$. For all $\gamma > \beta^2$, with
$Q$-probability 1, for all but a finite number of indices $N$

$$d_{TV}(\mathcal{L}_\eta(X(e^{\gamma N})), \pi_\beta) \le e^{-N^{1/4} (\log N)^{3/4}} \qquad (4.16)$$



Moreover for all S > 0, if 



t N = exp[/3 2 iV + v/TmA^cHog AO 1 / 4 + (2(3(3 C + 5((3 2 + (3 2 c )){ Cl N\ogN) 1 ' 2 } (4.17) 



The error terms in the bound (4.13) have no reason to be optimal. However in (4.5) the
order of magnitude of the error terms is optimal, as was observed in [4].

To prove the theorems we will need estimates for the constants $\mathcal{L}_{\pi_\beta}(p)$ and $\mathcal{L}_\eta(p)$ using
(2.19). This will be done now and the result will be collected in Proposition 4.9.

These estimates will also depend on the choice of paths $\gamma(x, y)$. To estimate the spectral gap, the following set of paths was introduced in [4], and it also works here: given $i \in \{1, \dots, N\}$ and $x, y \in X$ such that $x_i \ne y_i$, let $\gamma^i(x, y)$ be the path starting at $x$ and ending at $y$ obtained by flipping the disagreeing spins, starting at the site $i$ and then going cyclically. Let $\Gamma^i = \{\gamma^i(x, y),\ x, y \in X\}$. Given $x$, $y$ and $\gamma(x, y)$, we also write $\gamma(x, y)$ for the set of points visited by the path and $\gamma^{\circ}(x, y) = \gamma(x, y) \setminus \{x, y\}$ for the set of interior points of the path. Note that if the number of discrepancies between $x$ and $y$ is $n$, then there exist $n$ interior disjoint paths in $\{\gamma^i(x, y),\ i = 1, \dots, N\}$. This comes from the fact that if $i_1, \dots, i_n$ are the $n$ sites where $x$ and $y$ disagree, then the paths $\gamma^{i_1}(x, y), \dots, \gamma^{i_n}(x, y)$ are interior disjoint. The set of paths we will construct will depend on the realization of $H(x)$: it is a random set. Given a positive number $c_\varepsilon$, we will say that a point $z$ is good if $H(z) \le \sqrt{(1 + c_\varepsilon) 2 N \log N}$. Call $G$ the set of good points. If $z$ is not good, we call it 'bad' and write $B$ for the set of bad points. A path is good if all its interior points are good. Note that we need to select a path for any pair of points $(x, y)$, and the typical number of bad points is of order $2^N 2^{-(1+c_\varepsilon) \log N}$. We cannot neglect good paths $\gamma(x, y)$ with bad end points $x$ or $y$ or both. We construct the set of paths $\Gamma$ according to the following rules: For $\|x - y\|_1 \ge \frac{N}{\log N}$, if there is a good path in $\{\gamma^i(x, y),\ i = 1, \dots, N\}$, choose the first one and put it in $\Gamma$; otherwise, choose $\gamma^1(x, y)$. For $\|x - y\|_1 < \frac{N}{\log N}$, if there exists a good site $z$ in $X$ such that $\|x - z\|_1 \ge \frac{N}{\log N}$, $\|y - z\|_1 \ge \frac{N}{\log N}$, and if there are good paths, one in $\{\gamma^i(x, z),\ i = 1, \dots, N\}$ and another in $\{\gamma^i(z, y),\ i = 1, \dots, N\}$, such that the union of these two good paths is a self avoiding path, then we select this union as the path connecting $x$



then 



1 



(4.18) 
and $y$ in $\Gamma$ (note that this is a good path since $z$ is good); otherwise, select $\gamma^1(x, y)$. Note that all the paths constructed in this way have length smaller than $N$. A fundamental result, which can easily be proven by keeping track of the $Q$-probability in the proof of Proposition 4.1 in [4], is

Proposition 4.4 For all $c_\varepsilon > 0$, there exists $N_0(c_\varepsilon)$ such that for all $N \ge N_0(c_\varepsilon)$, with a $Q$-probability $\ge 1 - e^{-c_\varepsilon N}$, all the paths of the previous set $\Gamma$ are good, i.e. satisfy $H(z) \le \sqrt{(1 + c_\varepsilon) N \log N}$ for all $z \in \gamma(x, y) \setminus \{x, y\}$ and all $(x, y) \in X^2$. Moreover they have a length smaller than $N$.
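The cyclic paths $\gamma^i(x, y)$ used in this construction are easy to generate explicitly. The sketch below is only an illustration under stated assumptions: sites are 0-indexed, configurations are tuples of $\pm 1$, and the helper names `cyclic_path` and `interior` are hypothetical. It builds the paths started at each disagreeing site for one pair $(x, y)$, so that the interior-disjointness property stated above can be inspected directly.

```python
from itertools import combinations


def cyclic_path(x, y, i):
    """Path from x to y flipping the disagreeing spins, starting at site i and going cyclically."""
    n = len(x)
    current = list(x)
    path = [tuple(current)]
    for k in range(n):
        j = (i + k) % n
        if current[j] != y[j]:
            current[j] = y[j]
            path.append(tuple(current))
    return path


def interior(path):
    """Points visited by the path, endpoints excluded."""
    return set(path[1:-1])


x = (1, 1, 1, 1, 1)
y = (-1, 1, -1, -1, 1)
disagreeing = [i for i in range(len(x)) if x[i] != y[i]]   # sites 0, 2, 3
paths = {i: cyclic_path(x, y, i) for i in disagreeing}
for i, path in paths.items():
    print(i, path)
```

Each path has as many edges as there are discrepancies between $x$ and $y$, and the interiors of the three paths printed here share no configuration.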

We say that an edge $e = (x, x')$ is good if $x$ and $x'$ are good; this will be denoted by $e \in \mathcal{G}$. Otherwise the edge is bad: $e \in \mathcal{B}$. Note the important fact that, with our construction, a given edge $e = (x, x')$ belonging to $\Gamma$ can have at most one bad point among $x$ and $x'$.

Let us first estimate $\mathcal{L}_{\pi_\beta}(p)$, see (2.19). The weights $\lambda(x)$ are chosen in the following way: let $d$ be such that $\beta < d < \beta_c$, to be chosen later. We set $d = \beta(1 + \zeta)$ with $0 < \zeta < (\beta_c - \beta)/\beta$. Let

X(x) = \\ *B{x)>-dN (4ig) 



where 



A otherwise. 



kxex 



(4.20) 



(Z N (p,<-d)) p 



for some $\rho > 0$ to be chosen later.

First we consider the first term in the right hand side of (2.19); the other ones will be treated later. Let us denote

R(d, P , P )=J2 ■Kp{x){\{x)Y' 1 -v (4.21) 
Lemma 4.5 Let $\zeta > 0$, $0 < \rho < 1$ and $0 < p < 1/2$ satisfy $0 < \zeta < (\beta_c - \beta)/\beta$ and

\[
\frac{\rho p}{\zeta^2 (1 - p)} \le \frac{1}{2} \frac{\beta^2}{\beta^2 + \beta_c^2}
\tag{4.22}
\]

There exists an absolute constant $c_1$ and, for any $c > 0$, there exists $N_0(\beta, c, \zeta)$ such that for any $N \ge N_0(\beta, c, \zeta)$ such that

\[
\sqrt{\frac{N}{(1 + c) \log N}} \ge \frac{12 \beta_c \sqrt{c_1}}{\zeta^2 \beta}
\tag{4.23}
\]

then, with a $Q$-probability $\ge 1 - e^{-cN}$, we have

\[
R(d, p, \rho) \le 2
\tag{4.24}
\]

where $d = \beta(1 + \zeta)$.
Proof: Let us denote 



(4.25) 



and $R(d, p, \rho) = R_1(d, p, \rho) + R_2(d, p, \rho)$. We have $R_1(d, p, \rho) \le 1$ and, to estimate $R_2(d, p, \rho)$, we use the following lemma, which will be proved in Section VI.

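Lemma 4.6 There exists a constant $c_1$ such that for all $c > 0$ there exists an $N_0(\beta, c, \zeta)$ such that for all $N \ge N_0(\beta, c, \zeta)$, with a $Q$-probability $\ge 1 - e^{-cN}$,

\[
Z_N(\beta, < -d) \le e^{\beta\beta_c \sqrt{c_1(1+c) N \log N}}\, e^{N\left[\beta d - \frac{d^2}{2} + \frac{\beta_c^2}{2}\right]}
\tag{4.26}
\]

and

\[
Z_N(\beta) \ge e^{-\beta\beta_c \sqrt{c_1(1+c) N \log N}}\, e^{N\left[\frac{\beta^2}{2} + \frac{\beta_c^2}{2}\right]}
\tag{4.27}
\]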
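The leading exponent $\frac{\beta^2}{2} + \frac{\beta_c^2}{2}$ in (4.27) is the annealed value $\frac{1}{N} \log \mathbb{E}\, Z_N(\beta) = \frac{\beta^2}{2} + \log 2$, since $\beta_c = \sqrt{2 \log 2}$. The following sketch compares it with one sampled partition function; it assumes i.i.d. centered Gaussian energies of variance $N$, and the values of $N$, $\beta$ and the seed are illustrative choices, not taken from the paper.

```python
import math
import random


def log_partition(n, beta, rng):
    """log Z_N(beta) for one sample of 2^n i.i.d. N(0, n) energies, via log-sum-exp."""
    energies = [rng.gauss(0.0, math.sqrt(n)) for _ in range(2 ** n)]
    shift = max(-beta * e for e in energies)
    return shift + math.log(sum(math.exp(-beta * e - shift) for e in energies))


beta_c = math.sqrt(2 * math.log(2))
n, beta = 14, 0.5                      # illustrative values, with beta < beta_c
rng = random.Random(1)
free_energy = log_partition(n, beta, rng) / n
annealed = (beta ** 2 + beta_c ** 2) / 2
print(free_energy, annealed)
```

For $\beta < \beta_c$ the two numbers are expected to be close already at moderate $N$, the difference being of the sub-exponential order allowed by the correction factor in (4.27).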



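Note that

\[
R_2(d, p, \rho) = \frac{\big(Z_N(\beta, < -d)\big)^{1 + \frac{\rho p}{1-p}}}{Z_N(\beta)}
\]

Therefore Lemma 4.6 implies that

\[
R_2(d, p, \rho) \le e^{\left(2 + \frac{\rho p}{1-p}\right) \beta\beta_c \sqrt{c_1(1+c) N \log N}}\, e^{-N\left[\frac{\beta^2}{2} + \frac{\beta_c^2}{2}\right]}\, e^{N\left(1 + \frac{\rho p}{1-p}\right)\left[\beta d - \frac{d^2}{2} + \frac{\beta_c^2}{2}\right]}
\tag{4.28}
\]

Now using (4.22), we get

\[
\frac{\rho p}{1-p} \left[\beta d - \frac{d^2}{2} + \frac{\beta_c^2}{2}\right] \le \frac{1}{2} \frac{(d - \beta)^2}{2}
\tag{4.29}
\]

Using $0 < \rho < 1$ and $0 < p < 1/2$, we have $\rho p/(1 - p) < 1$; therefore (4.23) implies that

\[
\left(2 + \frac{\rho p}{1-p}\right) \beta\beta_c \sqrt{c_1(1+c) N \log N} \le \frac{1}{2} \frac{(d - \beta)^2}{2} N
\tag{4.30}
\]

Since $-\frac{\beta^2}{2} + \beta d - \frac{d^2}{2} = -\frac{(d-\beta)^2}{2}$, the bounds (4.28), (4.29) and (4.30) give $R_2(d, p, \rho) \le 1$, from which we immediately get (4.24).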
Now we estimate the other term in the right hand side of (2.19). Let us denote



T- 

^ r>(, 



Q(e) 



E 



x,y:j(x,y)Be 



X(x) X(y) 



(4.31) 



Proposition 4.7 We assume that $2(1 - \rho) < 1$.
There exists a constant $c_1$ such that for all $c > 0$ and all $\zeta > 0$, $\zeta < (\beta_c - \beta)/\beta$, satisfying (4.22), there exists $N_0(\beta, c, \zeta)$ such that for all $N \ge N_0(\beta, c, \zeta)$, with a $Q$-probability $\ge 1 - e^{-cN}$, we have



1 



2£*,(1) 



< 



22N 4 e 2ppc V c * (1+c) N log 



(4.32) 



Proof: Let us write 



2^(1) 



(4.33) 



where $L_{\mathcal{G}}$ is the same as (4.31) but with the sum $\sum_e$ restricted to good edges, and $L_{\mathcal{B}}$ is the same with the sum restricted to bad edges.

Let us first consider $L_{\mathcal{B}}$. Using convexity and symmetry, we can write

\[
L_{\mathcal{B}} \le 3 L_{\mathcal{B}}(>, >) + 6 L_{\mathcal{B}}(>, <)
\tag{4.34}
\]

where $L_{\mathcal{B}}(>, >)$ is the same as in (4.31) but with the following restrictions: $e \in \mathcal{B}$, $H(x) > -dN$, $H(y) > -dN$. $L_{\mathcal{B}}(>, <)$ is defined similarly.

Let $U = \{x ;\ H(x) > -dN\}$ and $D = \{x ;\ H(x) \le -dN\}$. Since a bad edge is the first or the last edge of the path, if $e = (z, z') \in \gamma(x, y)$ is a bad edge then we have either $z \in B$ and $x = z$, or $z' \in B$ and $z' = y$. By symmetry it is sufficient to consider the first case. Then $1/Q(e) = N Z_N(\beta) \exp(\beta H(z))$. Note in particular that it is not possible to have $e = (z, z') \in \gamma(x, y)$, $e$ bad and both $x$ and $y$ belonging to $D$. This is the reason why we do not have a term $L_{\mathcal{B}}(<, <)$.



L B (>,>) < 



2N 

e=(2,z'j 



E 



2iV 



E ■ 

e=(z,z') 



-{3H(2 



-1 2 



x,yEU xU 



-/3H(y) 



(4.35) 



< 2N 



Using similar arguments and recalling (4.20), we get 
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i8( -' <) -^03) £ e 

e=(z,z') 



< 



< 



2iV 
2N 



e=(z, 2 ') 



E «■ 

x,y€UxD 
yED 



n 2 



Z p N {(3,<-d) 



lZ N ([3,<-d)] 2{1 - p) 



(4.36) 



Using (4.26), (4.27), and $2(1 - \rho) < 1$ we get



Ab(>, <) < 2iVe 3/3/3c V c i( 1 + c )^ 1 °g^ e - 7V[ 



£ 2 +/3g+(rf-/3) ; 



r /3 2 +/3 c 2 



< 2iVe- Ar [ £ -^] 

< 2N 



(4.37) 



where we have used (4.23) at the second step. We have proved that

\[
L_{\mathcal{B}} \le 18 N
\tag{4.38}
\]

We consider now $L_{\mathcal{G}}$. As before, using convexity and symmetry, we write

\[
L_{\mathcal{G}} \le 4 L_{\mathcal{G}}(>, >) + 8 L_{\mathcal{G}}(>, <) + 4 L_{\mathcal{G}}(<, <)
\tag{4.39}
\]

We first consider $L_{\mathcal{G}}(<, <)$. Since for a good edge $e = (z, z')$ we have $H(z) \vee H(z') \le \sqrt{(1 + c_\varepsilon) N \log N}$, we therefore get



N e Py/(i+c e )Nio g N ^ 

M<,<) < ^T77^ 22 



E i 

x,yEDxD 



e -(3H(x) e -(3H(y) 



{^,y)^} z e N (p,<-d)ze N (p,<-d) 



(4.40) 



On one hand we have 



-f3H(x) 



-pH(y) 



* JL I{7(x ' y)E5e} < -d) Z>M < -d) 



<zf~ p \p,<-d) (4.41) 
On the other hand we have



e -/3H(x) e -f3H(y) 



e -0H(x) e -/3H(y) ( 442 ) 

^ zUl, < -d) zap, < -d) \ 

x,y€DxD N\r,- ) JVV^> - ) eE g 

<NZ 2 N {1 - p \(3,<-d) 



where at the last step we have used that the length of a path is smaller than $N$. Therefore, using (4.41) and (4.42) in (4.40), then (4.26) and (4.27), and at last $4(1 - \rho) < 2 < 3$, we get



Lr(< <) < N 2 e l3 y /{1+Cs)NlosN Z ^ 1 1^1=. Q 

< Ar 2 e /3 v /(l+^)^logjV e 6/3/3 eV /c 1 (l+c)jVlogjV e -jV 3(d ~ /3) ( 4 ' 43 ) 

, 3(d-/3) 2 

< Ar 2 e 7/3/? eV /ci(l+c)JVlogjV e -iV < ^2 



where we have used $\beta_c = \sqrt{2 \log 2} > 1$ and (4.23), and we have chosen $c_\varepsilon = c$ and $c_1 > 1$.

Consider now $L_{\mathcal{G}}(>, <)$. Using exactly the same kind of arguments, together with (4.26), (4.27) and $2(1 - \rho) < 1$, we get



Lg{>, <) < N ^y/ci(i+c)Ni OS N 



z% {1 - p \s,< 



d) 



< Ar 2 e 3/3/3 cV /ci(l+c)iVlogAr ( 



z N (p) 



(4.44) 



where at the last step we have used (4.23).
We consider now $L_{\mathcal{G}}(>, >)$. Since the edge is good, we have



Ne ^(l+c e )Nlo S N 
L o(>, >) < ^TTm E 



-l 2 



V Ir / n , p -f3H(x) -f3H(y) 



< 



iVe /3 v /(l+Ce)iVlogiV 



sup 

e 



*E 



E * 



x,yEU xU 

E e 
l{ 7 (x,y)3e} — 

x,y€UxU 

e -/3H(x) e -pH(y) 



-f3H(x) e -(3H{y) 

Zn(P) 



{l(x,y)Be} ' 



x,yEU xU 



(4.45) 



E i 



{7(a;,y)3e}' 



x,y€UxU 

To continue we will need an adaptation of [4]. Let us call 



-pH(x) e -f3H{y) 

Z N {I3) 



A(d) = sup 



E 1 

x,yeU x U 



{l{x,y)3e}- 



e -PH(x) e -0H(y) 

Z N (/3) 



(4.46) 



Recalling that the paths in $\Gamma$ are constructed using paths in $\bigcup_i \Gamma^i$, we get immediately

\[
A(d) \le N \sup_{1 \le i \le N} A^{(i)}(d)
\tag{4.47}
\]

where $A^{(i)}(d)$ is as in (4.46) but with paths in $\Gamma^i$. It is enough to consider the case $i = 1$, the other ones being similar. Now for a given edge $e = (z, z')$, there exists a $j \in \{1, \dots, N\}$ such that $z' = z^j$, that is, $z'$ is the configuration obtained from $z$ by flipping the spin at the site $j$. Note at this point that the set of all $(x, y)$ such that $\gamma(x, y) \ni e$ for $\gamma \in \Gamma^1$ is exactly

\[
\bigcup_{x \in \{-1,+1\}^{j-1}}\ \bigcup_{y \in \{-1,+1\}^{N-j}} \Big\{ \big( (x_1, \dots, x_{j-1}, z_j, \dots, z_N),\ (z_1, \dots, z_{j-1}, -z_j, y_{j+1}, \dots, y_N) \big) \Big\}
\tag{4.48}
\]
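The description (4.48) can be checked by brute force on a small hypercube. The sketch below assumes 0-indexed sites and takes $\gamma^1(x, y)$ to be the path that flips the disagreeing spins in increasing order of the site index; the function names are hypothetical. It enumerates all pairs $(x, y)$ whose path crosses a given oriented edge $(z, z^j)$ and compares the result with the product set.

```python
from itertools import product


def path_from_site_one(x, y):
    """gamma^1(x, y): flip the disagreeing spins in increasing order of the site index."""
    current = list(x)
    path = [tuple(current)]
    for j in range(len(x)):
        if current[j] != y[j]:
            current[j] = y[j]
            path.append(tuple(current))
    return path


def pairs_through_edge(z, j):
    """All (x, y) whose path gamma^1(x, y) crosses the edge (z, z^j), by brute force."""
    n = len(z)
    z_flipped = z[:j] + (-z[j],) + z[j + 1:]
    found = set()
    for x in product((-1, 1), repeat=n):
        for y in product((-1, 1), repeat=n):
            path = path_from_site_one(x, y)
            if (z, z_flipped) in zip(path, path[1:]):
                found.add((x, y))
    return found


def predicted_pairs(z, j):
    """The product set of (4.48), written with a 0-indexed site j."""
    n = len(z)
    return {
        (head + z[j:], z[:j] + (-z[j],) + tail)
        for head in product((-1, 1), repeat=j)
        for tail in product((-1, 1), repeat=n - j - 1)
    }


z, j = (1, -1, -1, 1), 2
brute = pairs_through_edge(z, j)
print(len(brute), brute == predicted_pairs(z, j))
```

In particular every edge is used by exactly $2^{N-1}$ pairs, which is the counting behind the factorisation of the sum over $(x, y)$ into a sum over the first $j - 1$ coordinates of $x$ and a sum over the last $N - j$ coordinates of $y$.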

Denoting $z_{>j} = (z_{j+1}, \dots, z_N)$, $z_{<j} = (z_1, \dots, z_{j-1})$,



a;e{-l,+l}j-i 



(4.49) 



and 



4-,-(ft>-d)[*<y,-*i] 



E e-^^'- 2 ^)l { H (2<J ,- 2 ,,,)>-^ } (4.50) 
tf e{-i,+i}"-i 
we get immediately:



A (1) (d) < sup -!—zf\{^ > -WzitZ^zfiLtf, > -d)[ Z<j , - Zj ] (4.51) 

To continue, we need the following lemma that will be proved in the next section. 

Lemma 4.8 There exists a constant $c_1 > 0$ such that for all $c > 0$, if $c_u = c_1^{-1}(2 \log 2 + c)$, then we can find an $N_0 = N_0(\beta, c)$ such that for all $N \ge N_0(\beta, c)$, with a $Q$-probability $\ge 1 - e^{-cN}$, if we call $M = \sqrt{N/\log_2(N c_u)}$ and $j - 1 = \alpha N$, then

sup Zf} x {p, > -d)[ Zj , Z>j ] < ^Ne fi ^ Nlo ^ N) {V + Z N {(3, d, a)) (4.52) 

z j ' z >j 

where 

if VaM 2 - 1 < l-M 

Pc 

if j-M < VaM 2 - 1 < j-M (4.53) 
if 4-M < %/aM 2 - 1 

Pc 

Now inserting (4.53) for $j - 1 = \alpha N$ and $N - j = (1 - \alpha) N$ in (4.51), considering the nine resulting terms, using

\[
\beta d < \beta\beta_c \le \frac{\beta^2}{2} + \frac{\beta_c^2}{2}
\tag{4.54}
\]

to simplify the computations, and maximizing over $\alpha \in [0, 1]$, it is just a long task to get

\[
A^{(1)}(d) \le \sqrt{N}\, e^{\beta\beta_c \sqrt{N \log_2(c_u N)}}\, e^{\beta d N}
\le \sqrt{N}\, e^{\beta\beta_c \sqrt{c_1(1+c) N \log N}}\, e^{\beta d N}
\tag{4.55}
\]

with a $Q$-probability $\ge 1 - e^{-cN}$. Inserting (4.55) and (4.47) in (4.45), we get

Lg(>, >) < N ^oVci(l+c)Nlo S N e /3dN (456) 

Using (4.33), (4.38), (4.43), (4.44) and (4.56) we get (4.32). ■ 

Now we estimate $\mathcal{L}_\eta(p)$, see (2.19), when $\eta$ is the uniform measure on $X$. We take the weights $\mu(x) = 1$. Since we have already estimated the first factor in Lemma 4.5, it remains to estimate



Z N ((3,d, a) 



( e f3dN 

e f3dN _j_ g^I^+a-f ] 

B 2 2 
e N[^+ a ^] 
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XI) 



>c;(i) 



Q(e) 



-l 2 



E 



7T/?(a;) 1 
X(x) 2 



(4.57) 



Proposition 4.9 There exists a constant $c_1$ such that for all $c > 0$ and all $\zeta > 0$, $\zeta < (\beta_c - \beta)/\beta$, there exists $N_0(\beta, c, \zeta)$ such that for all $N$ that satisfy (4.23) and are larger than $N_0(\beta, c, \zeta)$, with a $Q$-probability $\ge 1 - e^{-cN}$,



1 



2£;(i) 



< 



AN 2 e ^l3 c Cl (l+c)N log N e f3dN 



(4.58) 



Proof: As before, by considering separately the cases where $e \in \mathcal{G}$ and $e \in \mathcal{B}$, we write



2£;(i) 



= L B ( V ) + Lg( V ) 



(4.59) 



Distinguishing bad and good edges and separating the cases $x \in D$ or $x \in U$, we get four terms that we call $L_{\mathcal{B}}(\eta, >)$, $L_{\mathcal{B}}(\eta, <)$, $L_{\mathcal{G}}(\eta, >)$ and $L_{\mathcal{G}}(\eta, <)$.

Let us start with $L_{\mathcal{B}}(\eta, <)$. We should then have $y = z' \in B$. Therefore



Lb{v,<) < 



< 



2N 

Z N (P) 

2N 
2N 



E < 

e=(z,z') 



pH{z') 



-1 2 



^ Z p ((3,<-d)2^ {y=z ' } 



Z^- p \(3,<-d) 



z N (P) 



Zn(-P) 



2 N 

zf- p) (^<-d) 

2 2N 



(4.60) 



Now since it is clear that $Z_N(\beta)$ and $Z_N(-\beta)$ have the same distribution and therefore satisfy the same estimates, we get that with a $Q$-probability $\ge 1 - e^{-cN}$,

\[
\frac{Z_N(-\beta)}{Z_N(\beta)} \le e^{2\beta\beta_c \sqrt{c_1(1+c) N \log N}}
\tag{4.61}
\]



Using now (4.26), and $2(1 - \rho) < 1$ we get



Lb(v, <) < 2Ne spf3c ^ Cl(1+c)Nlog N e N [ 2 P d - d2 \ 
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(4.62) 



Consider now $L_{\mathcal{B}}(\eta, >)$. Then $e$ is bad and $x \in U$. We have to deal separately with case 1, $x = z \in B$, and case 2, $y = z' \in B$. By convexity

\[
L_{\mathcal{B}}(\eta, >) \le 2 L_{\mathcal{B}}(\eta, >, 1) + 2 L_{\mathcal{B}}(\eta, >, 2)
\]

On the one hand we have



L B (V,>,1) < 



N 



Z N {(3) 
= N 



-/3H( 2 ) 



On the other hand we have 



N 



Zn(P) ^ 



E 1 



.x€I7 



< 



AT 



Z N (f3) 



Zn(-P) 



Z N {(3) 

2 N 



< Ne Al3pc ^/ Cl{l+c)N log N e /32N 
Collecting (4.62), (4.64) and (4.65), we get 

L B (rj) < 2Ne 4 ^V^( 1 +^ N ^N + e [2pd-d 2 ]N^ 



< 



4 j\r e 4/3/3c \/ci(l+c)N log N e 0dN 



where, at the last step, we used that $d > \beta$ and therefore $2\beta d - d^2 < \beta d$.
Consider now $L_{\mathcal{G}}(\eta, <)$. Since we consider now good edges, we have



Ne 0y/{l+c B )Nl Of rN 
Lg( V , <) < — — J2 



Z N {(3) 



e -f3H( X ) 

2s Z P N {(3,< -d) 2 



N 



< 



N /3y/(l+c e )NlogN 

-z%- p \p<-d)Y: 



Z N {(3) 



-f3H(x) 



Z P N {(3,< -d) 2 N 



N 2 py/(l+c e )Nlo S N 



< N 2 e 3/3p c y/ Cl (l+c)N log JV g - \ (d-p fN 

< N 2 



(4.63) 



(4.64) 



(4.65) 



(4.66) 



(4.67) 
where we used that $2(1 - \rho) < 1$ and (4.23) at the last step.

It remains to consider $L_{\mathcal{G}}(\eta, >)$. Using the fact that $e$ is good, we get



N J3y/(l+c e )NlogN 



-\ 2 



E 



-f3H(x) 



2 N 



< 



Ne Py/(l+c e )N log 



N 



sup 



E 



-f3H(x) 



2 N 



X 



—y 



eeQ 



E 



-f3H(x)_ 



x£U,y,-f(x,y)Be 



2 N 



< N 2 e l3^(l+c e )N\ogN 



sup 

eeG 



E 



7 -(3H(x) m 



xEU,y,~i{x,y)3e 



2 N 



(4.68) 



To estimate this last supremum, we use an argument similar to the one we used to treat (4.51). Using (4.52), and the same notation as in (4.49), after a not too long computation, we get that with a $Q$-probability $\ge 1 - e^{-cN}$



A(?7, d) = sup 

eeQ 



E 



xEU,y,j(x,y)Be 



-pH(x)J_ 
2 N 



< SUp SUp Zj-i((3, > —d)[zj, Zyj]2 3 

i<j<N zex 

< e f3f3 cy /N\og 2 (c u N) e f3dN 



(4.69) 



Collecting (4.68) and (4.69), we get

\[
L_{\mathcal{G}}(\eta, >) \le N^2 e^{2\beta\beta_c \sqrt{c_1(1+c) N \log N}}\, e^{\beta d N}
\tag{4.70}
\]

Collecting (4.66), (4.67) and (4.70), this entails (4.58). ■

Now we put together all the results concerning the quantities $\mathcal{L}_{\pi_\beta}(p)$ and $\mathcal{L}_\eta(p)$. That is, collecting Lemma 4.5 and Propositions 4.7 and 4.9, and recalling (2.19), we have
Proposition 4.10 Let $\beta < \beta_c$, $0 < \zeta < (\beta_c - \beta)/\beta$ and $0 < p < 1/2$ satisfy



P 



< 



1 



C{i-p) /3 2 + A : 
There exists an absolute constant $c_1$ such that for all $c > 0$ there exists an $N_0(\beta, c, \zeta)$ such that for all $N \ge N_0(\beta, c, \zeta)$ satisfying condition (4.23), with a $Q$-probability $\ge 1 - e^{-cN}$, we have



Y l-3p+y 



2 



<4 p(i-p) 22iV 4 e 2/3/3c V /c i( 1 + c ) 7V1 °g Ar e /32(1+c)Ar (4.71) 



£^ (p) 

and 

1 < 4 " W=p) 4N 2 e ^Pc V c i (!+ c ) ^ log (i+C) ^ (472) 



2-3p+2p 2 



Remark: the aim of this remark is to discuss the implications of Proposition 4.10 as far as the behaviour of the eigenvectors of the Metropolis dynamics is concerned. To simplify things, we only consider the almost sure asymptotics of the first non-trivial eigenvector: assume that we have constructed the Hamiltonians $H(\sigma)$ corresponding to the different values of $N$ on the same probability space, and fix one realisation. From Proposition 4.10, we then know that

limsup ^ log z^)^ 2(1+c) (473) 

Let now $\Lambda$ denote the spectral gap of $\mathcal{L}$. $\Lambda$ depends on the realisation of $H$ and on $N$. Let $\psi$ be the corresponding eigenvector. We assume that $\pi_\beta(\psi^2) = 1$. From [4], we then know that

limllogi = /9/? c (4.74) 
Therefore, provided we choose ( small enough, we will have 

lim sup -J- log A < -a (4.75) 

where $a > 0$ is a deterministic constant that depends on $\beta$. It then follows from (2.18)
that 7173d -01) < exp(— aN) for large enough N and with possibly a different value for the 
constant $a$. In other words the eigenvector $\psi$ becomes concentrated on its support. As a matter of fact, this is only another way to understand the fact that thermalisation times depend a lot on the initial law: eigenvectors corresponding to low eigenvalues become singular.

Proof of Theorem 4.1: recalling (2.10), (4.24), (4.32) and (4.58), we get

\[
\frac{1}{N} \log T_N(\varepsilon, c, \eta) \le \frac{1}{N} \log C_p + \frac{2 - p}{pN} \log \frac{1}{\varepsilon} + \frac{4 \log N}{N}
+ 2\beta\beta_c \Big(\frac{c_1(1+c) \log N}{N}\Big)^{1/2} + \beta^2 (1 + \zeta)
\tag{4.76}
\]

where $C_p$ is the constant in (2.10) (remember that $d = \beta(1 + \zeta)$). Now taking first the limit $N \uparrow \infty$, we get

\[
\limsup_{N \uparrow \infty} \frac{1}{N} \log T_N(\varepsilon, c, \eta) \le \beta^2 (1 + \zeta)
\tag{4.77}
\]

(4.77) is satisfied for all $\zeta > 0$ (just choose $p$ small enough so that (4.22) is satisfied). Therefore

\[
\limsup_{N \uparrow \infty} \frac{1}{N} \log T_N(\varepsilon, c, \eta) \le \beta^2
\]



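Proof of Theorem 4.2:

The proof is a little more involved than the previous one. Choose

\[
\rho = 3/4 \quad \text{and} \quad \log \frac{1}{\varepsilon} = N^{1/4} (\log N)^{3/4}
\]

\[
\zeta^2 = 12\, \frac{\beta_c}{\beta} \Big(\frac{c_1(1+c) \log N}{N}\Big)^{1/2}
\]

\[
\frac{p}{1 - p} = \zeta^2\, \frac{2}{3}\, \frac{\beta^2}{\beta^2 + \beta_c^2}
\]

Then (4.22) and (4.23) are satisfied. Also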

2 1 f3 2 + (51 1 



V 4 /3/3 c ^1(1 + c) V^logiV 



and we deduce the upper bound (4.13) from (4.76). The proof of (4.15) is similar, with now $\log(1/\varepsilon) = \delta \log N$. ■



V. The Medium from the point of view of the process 

In this section, we shall consider the process of the environment as seen from the particle. This process will be denoted by $\omega_t$. For any fixed $N$, let $S_N = \{-1, +1\}^N$. We endow $S_N$ with its natural group structure, i.e. for $\sigma, \sigma' \in S_N$, we let $\sigma.\sigma' \in S_N$ be the configuration $(\sigma.\sigma')_i = \sigma_i \sigma'_i$. Let $\mathbf{1}$ be the configuration $\sigma_i = 1$ for all $i$. For $1 \le i \le N$, we also define $\mathbf{i}$ to be the configuration whose $i$-th coordinate is $-1$, and the other coordinates are $+1$. Thus $\sigma.\mathbf{i}$ is the configuration obtained by flipping the $i$-th coordinate of $\sigma$.

Without loss of generality, we may, and will, assume that our random Hamiltonian $H$ is defined on the canonical space $\Omega = \mathbb{R}^{S_N}$. $Q$ is therefore the centered product Gaussian probability on $\Omega$, of variance $N$. By duality, $S_N$ acts on $\Omega$ through the rule $(\sigma.h)(\sigma') = h(\sigma.\sigma')$, where $\sigma, \sigma' \in S_N$ and $h \in \Omega$.

For each choice of $H \in \Omega$, let us denote by $X^H$ the Metropolis dynamics with Hamiltonian $H$, i.e. $X^H$ is the Markov process with generator

\[
L^H f(\sigma) = \frac{1}{N} \sum_{i=1}^{N} e^{-\beta [H(\mathbf{i}.\sigma) - H(\sigma)]_+} \big( f(\mathbf{i}.\sigma) - f(\sigma) \big)
\tag{5.1}
\]

here, as before, $[x]_+$ is the positive part of $x$. We denote by $P^H_t = e^{tL^H}$ its semi-group, and let $\mathbb{E}^H_\sigma$ be the law of $X^H$ when $X^H(0) = \sigma$.

Let us now define the stochastic process $\omega_t = X^H_t.H$. The state space of $\omega_t$ is $\Omega$. $\omega_t$ is simply the Hamiltonian translated according to the position of the particle. For instance note that, by definition, $\omega_t(\mathbf{1}) = X^H_t.H(\mathbf{1}) = H(X^H_t)$ is nothing but the value of the Hamiltonian evaluated at the position of the particle at time $t$. We consider the canonical construction of the Markov process $X^H_t$, so we call $X_t$ the coordinate process on the space of càdlàg functions taking values in $S_N$. We call $\mathbb{E}^H_\sigma$ the law of the Markov process with generator $L^H$ starting from $\sigma$. We denote by $\mathcal{E}^H_\sigma$ the law of the process $\omega_t = X_t.H$ when $X_t$ is distributed according to $\mathbb{E}^H_\sigma$.
By definition we have

\[
\mathcal{E}^H_\sigma [f(\omega_t)] = \mathbb{E}^H_\sigma [f(X_t.H)]
\tag{5.2}
\]



The point is the following

Lemma 5.1

\[
\mathcal{E}^H_\sigma = \mathcal{E}^{\sigma.H}_{\mathbf{1}}
\tag{5.3}
\]



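Proof:

Let $\phi$ be some measurable function on $\Omega$. Define $\phi^H(\sigma) = \phi(\sigma.H)$. Note that

\[
L^{\sigma.H} \phi^{\sigma.H}(\sigma') = L^H \phi^H(\sigma.\sigma');
\]

this follows from

\[
\begin{aligned}
\big(L^{\sigma.H} \phi^{\sigma.H}\big)(\sigma')
&= \frac{1}{N} \sum_{i=1}^{N} e^{-\beta [\sigma.H(\mathbf{i}.\sigma') - \sigma.H(\sigma')]_+} \big( \phi((\mathbf{i}.\sigma').\sigma.H) - \phi(\sigma'.\sigma.H) \big) \\
&= \frac{1}{N} \sum_{i=1}^{N} e^{-\beta [H(\mathbf{i}.(\sigma.\sigma')) - H(\sigma.\sigma')]_+} \big( \phi(\mathbf{i}.(\sigma.\sigma').H) - \phi((\sigma.\sigma').H) \big)
\end{aligned}
\tag{5.4}
\]

Therefore, since $\phi^{\sigma.H}(\sigma') = \phi^H(\sigma.\sigma')$, we get

\[
\big(e^{tL^{\sigma.H}} \phi^{\sigma.H}\big)(\sigma') = \big(e^{tL^H} \phi^H\big)(\sigma.\sigma')
\tag{5.5}
\]

Applying this last equality for $\sigma' = \mathbf{1}$, we have proved that

\[
\mathcal{E}^{\sigma.H}_{\mathbf{1}} [\phi(\omega_t)] = \mathcal{E}^H_\sigma [\phi(\omega_t)]
\tag{5.6}
\]

that is, (5.3) holds for functions of one coordinate.

To extend it to an arbitrary cylindrical function we have, assuming $t_1 < t_2$,

\[
\begin{aligned}
\mathcal{E}^H_\sigma [\phi_1(\omega_{t_1}) \phi_2(\omega_{t_2})]
&= \mathbb{E}^H_\sigma [\phi_1(X_{t_1}.H)\, \phi_2(X_{t_2}.H)] \\
&= \mathbb{E}^H_\sigma \big[\phi_1(X_{t_1}.H)\, \mathbb{E}^H_{X_{t_1}} [\phi_2(X_{t_2 - t_1}.H)]\big]
\end{aligned}
\tag{5.7}
\]

where at the last step we have used that $X_t$ is a homogeneous Markov process.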
Using (5.3), we have 



(5.8) 



using again (5.3) twice, we have also 



H 



IE 



0i (X tl .H)IE^ H [0 2 (X t2 _ tl .X tl .H)] 

! (X tl .a.H)IE^ a - H [0 2 (X t2 _ tl .X tl .a.H)] 
1 (X tl .a.H)IE^ [cf> 2 (X t2 _ tl .a.& 



= JE\ 



(5.9) 



H 



Using once again the Markov property for X t , we get 

IEl H [fa(X tl .<T.H)W%? {MXt 2 - tl .a.H)]\ 

= IE1 H [MX tl .cT.H)MXt 2 .cT.H)] 
= eT H [<j>iMMut 2 )} 



(5.10) 



Now it is easy to generalize what we just did to an arbitrary product of functions of one 
coordinate, then to cylindrical function and to measurable function by the monotone class 
theorem. This ends the proof of the lemma. ■ 
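The identity $L^{\sigma.H} \phi^{\sigma.H}(\sigma') = L^H \phi^H(\sigma.\sigma')$ on which this proof rests can be checked numerically on a small system. The sketch below is an illustration under stated assumptions: sites are 0-indexed, the Hamiltonian is stored as a dictionary over $\{-1,+1\}^N$, and the test function `phi` is an arbitrary hypothetical choice (any function of the environment would do).

```python
import math
import random
from itertools import product


def mult(a, b):
    """Group law on {-1,+1}^N: coordinatewise product."""
    return tuple(s * t for s, t in zip(a, b))


def act(sigma, h):
    """Shifted Hamiltonian sigma.h, defined by (sigma.h)(tau) = h(sigma.tau)."""
    return {tau: h[mult(sigma, tau)] for tau in h}


def site(n, i):
    """Configuration with -1 at coordinate i (0-indexed) and +1 elsewhere."""
    return tuple(-1 if k == i else 1 for k in range(n))


def generator(h, beta, f, sigma):
    """Apply the Metropolis generator (5.1) with Hamiltonian h to f at sigma."""
    n = len(sigma)
    total = 0.0
    for i in range(n):
        flipped = mult(site(n, i), sigma)
        rate = math.exp(-beta * max(h[flipped] - h[sigma], 0.0))
        total += rate * (f(flipped) - f(sigma))
    return total / n


def phi(h):
    """Hypothetical test function of the environment."""
    n = len(next(iter(h)))
    return math.tanh(h[(1,) * n]) + 0.5 * h[site(n, 0)] ** 2


def lifted(h):
    """phi^h(sigma) = phi(sigma.h)."""
    return lambda sigma: phi(act(sigma, h))


rng = random.Random(2)
n, beta = 3, 0.7                       # illustrative values
configs = list(product((-1, 1), repeat=n))
H = {x: rng.gauss(0.0, math.sqrt(n)) for x in configs}

max_gap = 0.0
for sigma in configs:
    shifted = act(sigma, H)
    for sigma_prime in configs:
        lhs = generator(shifted, beta, lifted(shifted), sigma_prime)
        rhs = generator(H, beta, lifted(H), mult(sigma, sigma_prime))
        max_gap = max(max_gap, abs(lhs - rhs))
print(max_gap)
```

The printed gap is zero up to rounding for every pair $(\sigma, \sigma')$, which is the finite-dimensional content of (5.4); exponentiating the generator then gives (5.5).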

Note that $\omega_t$ is the image of $X_t$ by the map $X_t \to X_t.H$. In general the image of a Markov process is not Markovian; however, here we have the

Lemma 5.2 $\omega_t$ is a homogeneous Markov process.
Proof: It is enough to prove that



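We have

\[
\mathcal{E}^H_\sigma [\phi_1(\omega_{t_1}) \phi_2(\omega_{t_2})] = \mathbb{E}^H_\sigma [\phi_1(X_{t_1}.H)\, \phi_2(X_{t_2}.H)]
\]

Since $X_t$ is a homogeneous Markov process, we have

\[
\mathbb{P}^H_\sigma [X_{t_1} = \sigma_1, X_{t_2} = \sigma_2] = \mathbb{P}^H_\sigma [X_{t_1} = \sigma_1]\ \mathbb{P}^H_{\sigma_1} [X_{t_2 - t_1} = \sigma_2]
\]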

Therefore we get 



H' 



The point is that using (5.3), we have 



^I {fTlM ; } Pf [X tl =a 1 ]lE% 1 [MX t2 - tl .H)] 
ef [^( Wt2 _ fl )]^l {ai .^ H;} ]P a H [X tl = ffl ] 



Therefore we get 

JE H a \^{X tl .R)^ 2 {X t2 .R)\ 

= J2MH[)ef [M«H>-t 1 )]'Z l *{* 1 .H=Hi}lP?[X tl 



= <ri 



H' 



IE 



H 



which is what we wanted to prove. 



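Let $\pi^H_\beta$ be the Gibbs measure with Hamiltonian $H$, i.e.

\[
\pi^H_\beta(\sigma) = \frac{e^{-\beta H(\sigma)}}{\sum_{\sigma' \in S_N} e^{-\beta H(\sigma')}}
\]

and let us define the probability $\nu^H_\beta$ on $\Omega$ by

\[
\nu^H_\beta(f) = \sum_{\sigma \in S_N} f(\sigma.H)\, \pi^H_\beta(\sigma)
\tag{5.17}
\]

when $f : \Omega \to \mathbb{R}$. That is, for all $H' \in \Omega$,

\[
\nu^H_\beta(H') = \sum_{\sigma \in S_N} \pi^H_\beta(\sigma)\, 1_{\{H' = \sigma.H\}}
\tag{5.18}
\]

We have the

Lemma 5.3 For each $H \in \Omega$, $\nu^H_\beta$ is an invariant and reversible measure for $\omega_t$.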
Proof: 

The invariance follows from 

J2""(H')e? E < (^I^.^ef [0(u*)] 

= E [^(^)] 

aeSiv (5.19) 

uESn 

= v"{4>) 

where we have used the fact that $\pi^H_\beta$ is invariant for $X_t$ at the third step.
The reversibility follows from

E $ H {°)rf{°)lE* ty H (Xt)} = E 1> H {°)rf{°)E% [$ H {Xt)] (5-20) 

since $\pi^H_\beta$ is reversible for $X_t$. This ends the proof of the lemma. ■

Now, for any bounded measurable function $f$ defined on $\Omega$, we have, as $t$ tends to $+\infty$,

\[
\mathcal{E}^H_\sigma [f(\omega_t)] = \mathbb{E}^H_\sigma [f(X^H_t.H)] \to \nu^H_\beta(f)
\tag{5.21}
\]

We are interested in estimating the speed of convergence in (5.21). A fundamental fact is stated in the following lemma.

Lemma 5.4 For any $\varphi : \Omega \to \mathbb{R}$, $Q\big[\mathcal{E}^H_\sigma(\varphi(\omega_t))\big]$ is independent of $\sigma \in S_N$.

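Proof: This follows from the fact that, on the one hand, for all $\sigma \in S_N$ and for all $f : \Omega \to \mathbb{R}$, we have

\[
Q[f(H)] = Q[f(\sigma.H)]
\tag{5.22}
\]

since $Q$ is invariant under any permutation of the configurations.
Therefore, using (5.4), we have, for all $\varphi : \Omega \to \mathbb{R}$,

\[
Q\big[\mathcal{E}^H_\sigma(\varphi)\big] = Q\big[\mathcal{E}^{\sigma.H}_{\mathbf{1}}(\varphi)\big] = Q\big[\mathcal{E}^H_{\mathbf{1}}(\varphi)\big]
\tag{5.23}
\]

which is what we wanted to prove. ■

Now we can define the following time:

\[
T_{av}(\varepsilon) = \inf\Big\{ t > 0 \ \text{s.t.}\ \sup_{s \ge t}\ \sup_{\|\varphi\|_\infty \le 1} Q\big[\, |\mathcal{E}^H_\sigma(\varphi(\omega_s)) - \nu^H_\beta(\varphi)|\, \big] \le \varepsilon \Big\}
\tag{5.24}
\]

here $\|\varphi\|_\infty = \sup_{\omega \in \Omega} |\varphi(\omega)|$. $T_{av}(\varepsilon)$ is the time after which the average over the medium of the law of the medium as seen from the process stays within $\varepsilon$ of the reversible measure $\nu^H_\beta$.
The main result of this section is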

Theorem 5.5 For all $\varepsilon > 0$, for all $\beta < \beta_c$,

\[
\limsup_{N \uparrow \infty} \frac{1}{N} \log T_{av}(\varepsilon) \le \beta^2
\tag{5.25}
\]
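Proof: Using Lemma 5.4, denoting by $d\eta(\sigma)$ the uniform measure on $S_N$, we get

\[
Q\big[\, |\mathcal{E}^H_\sigma(\varphi(\omega_t)) - \nu^H_\beta(\varphi)|\, \big] = \int d\eta(\sigma')\, Q\big[\, |\mathcal{E}^H_{\sigma'}(\varphi(\omega_t)) - \nu^H_\beta(\varphi)|\, \big]
\tag{5.26}
\]

since the left hand side does not depend on $\sigma$. Now using Tonelli's theorem we get

\[
\int d\eta(\sigma')\, Q\big[\, |\mathcal{E}^H_{\sigma'}(\varphi(\omega_t)) - \nu^H_\beta(\varphi)|\, \big] = Q\Big[ \int d\eta(\sigma')\, |\mathcal{E}^H_{\sigma'}(\varphi(\omega_t)) - \nu^H_\beta(\varphi)| \Big]
\tag{5.27}
\]

Now, since

\[
\mathcal{E}^H_\sigma(\varphi(\omega_t)) = \mathbb{E}^H_\sigma(\varphi^H(X_t))
\tag{5.28}
\]

using (5.17), we get

\[
\nu^H_\beta(\varphi) = \pi^H_\beta(\varphi^H)
\tag{5.29}
\]

Therefore

\[
\mathcal{E}^H_\sigma(\varphi(\omega_t)) - \nu^H_\beta(\varphi) = \mathbb{E}^H_\sigma(\varphi^H(X_t)) - \pi^H_\beta(\varphi^H)
\tag{5.30}
\]

therefore collecting what we just did we get

\[
Q\big[\, |\mathcal{E}^H_\sigma(\varphi(\omega_t)) - \nu^H_\beta(\varphi)|\, \big] = Q\Big[ \int d\eta(\sigma')\, \big| \big(P^H_t(\varphi^H)\big)(\sigma') - \pi^H_\beta(\varphi^H) \big| \Big]
\tag{5.31}
\]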
To continue, recalling Propositions 4.7 and 4.9, for all $c$, let $A(c)$ be the subspace of $\Omega$, of $Q$-probability bigger than $1 - 2e^{-cN}$, which is the intersection of the two subspaces where we have the estimates (4.32) and (4.58).
Then we get



Q 



< 2Q [|b||ooI^c (c) ] + 



+ c p t-^ 2 -^Q[i^ (c) ||^|U(£^( P ))-^ 2 (/: H (p))- 



(5.32) 



p 2 /(4-2p) 



where the first part of the inequality follows from the fact that for all $H \in \Omega$ and all $t > 0$, $P^H_t$ is a contraction operator from $L^\infty[\Omega, \mathbb{R}]$ into itself and $\pi^H_\beta$ is a probability measure.

The second part follows from (2.11). We recall that C p = e~ p / 2 2 p / 2 ((2 - p)/p)~ p2 Z^' 2 ^ . 
Using now Proposition 4.7, Proposition 4.9 and $\|\varphi\|_\infty \le 1$, we get

Q[|e?(¥>M)-^(¥>)|] <2e" cAr + 

+ c p t~ p/{2 ~ p \22) p2/ ^~ 2p \4) p/2 (N) 8p/4 ~ 2p (e 2pp ^ Cl{1+c)NlogN 

(5.33) 

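From now on the proof is exactly the same as the proof of Theorem 4.1.
At this point it is clear that we can also give estimates similar to the ones given in Theorem 4.2, by using the same arguments as before and the computation done in the proof of Theorem 4.2. Let us state this as a theorem.

Theorem 5.6 For all $N$ large enough, for all $\beta < \beta_c$, there exists a constant $c_1 > 0$ such that for all $c > 0$ there exists a constant $C_0 = C_0(c, \beta)$ such that

\[
\frac{1}{N} \log T_{av}\big(e^{-N^{1/4}(\log N)^{3/4}}\big)
\le \beta^2 + 2\beta\beta_c \Big(\frac{c_1(1+c)\log N}{N}\Big)^{1/2}
+ c_2(\beta, c) \Big(\frac{\log N}{N}\Big)^{1/4}
+ C_0 \Big(\frac{\log N}{N}\Big)^{3/4}
\tag{5.34}
\]

where

\[
c_2(\beta, c) = \beta \big(12 \beta\beta_c \sqrt{c_1(1+c)}\big)^{1/2} + \frac{1}{4} \frac{\beta^2 + \beta_c^2}{\beta\beta_c \sqrt{c_1(1+c)}}
\tag{5.35}
\]

Moreover for all $\delta > 0$

\[
\frac{1}{N} \log T_{av}(N^{-\delta})
\le \beta^2 + \beta \big(12 \beta\beta_c \sqrt{c_1(1+c)}\big)^{1/2} \Big(\frac{\log N}{N}\Big)^{1/4}
+ 2\beta\beta_c \Big(\frac{c_1(1+c)\log N}{N}\Big)^{1/2}
+ \frac{\delta}{4} \frac{\beta^2 + \beta_c^2}{\beta\beta_c \sqrt{c_1(1+c)}} \Big(\frac{\log N}{N}\Big)^{1/2}
+ C_0 \frac{\log N}{N}
\tag{5.36}
\]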






VI. Statics estimates for the REM 



In this section we will give some estimates for the various constrained partition functions 
and partition functions on small spaces for the REM. These are just adaptations of similar 
estimates done in [4] section 4.2.1. 

Let us first prove Lemma 4.8. We denote $Z_\alpha(\beta, > -d) = Z_{j-1}(\beta, > -d)[z_j, z_{>j}]$. Let $M$ be as in Lemma 4.8, and partition the real interval $(-\infty, dN]$ into the intervals



A 



R N ' 



if 1 < k < 4-M - 1 



Let 



\[
N_k = N_k(z_j, z_{>j}) = \sum_{x \in \{-1,+1\}^{j-1}} 1_{\Delta_k}\big(-H(x, z_j, z_{>j})\big)
\]



(6.1) 



(6.2) 



(6.3) 



be the occupation number of the interval $\Delta_k$. It is easy to check that, if $p_k = \mathbb{P}[-H(x) \in \Delta_k]$, then

. /AT (k + 1) 2 ./AT b 2 

(6.4) 
(6.5) 



Pc V 2 "^^ <Pk< ^~M 2 
Using the exponential Markov inequality and optimizing we get

\[
\mathbb{P}[N_k \ge \rho_k \mathbb{E}(N_k)] \le \exp\{-\lambda_k 2^{\alpha N}\}
\]

where

p k = 2 ^[^#-] + +2 



and if p k p k > 1, X k = oo, while if < 1 

Afe = PfcPfc log log 

1 - pkPk 



l-Pk + 



pkPk(l-Pk) 
1 - PkPk 



(6.6) 



(6.7) 



It is not too long to check that Afc > pkPkCi for some positive constant c\, and also 
PkPk > 2 7V / m2 , therefore with our choice of M, we get 



P [N k > Pk IE(N k )} < exp - {c^^ 2 } 
< 2~ 2N exp -(cN) 
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(6.8) 



Note that the term $2^{-2N}$ will be more than enough to get uniformity with respect to the index $i$ for the chosen family of paths, the index $j$, the configurations $z_j$, $z_{>j}$, and the index $k$.
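The exponential Markov (Chernoff) step can be illustrated on a generic binomial variable, since for fixed $z_j$, $z_{>j}$ the occupation number $N_k$ is a sum of independent indicators. The sketch below uses the standard relative-entropy form of the bound, which is an assumption here and may be normalised differently from the $\lambda_k$ of the text; the numerical values are illustrative and not taken from the paper.

```python
import math


def binomial_upper_tail(n, p, k):
    """Exact P[Bin(n, p) >= k]."""
    return sum(math.comb(n, m) * p ** m * (1 - p) ** (n - m) for m in range(k, n + 1))


def chernoff_rate(p, rho):
    """Relative entropy of Bernoulli(rho * p) with respect to Bernoulli(p), for rho > 1."""
    q = rho * p
    if q > 1:
        return math.inf          # the event {N >= rho * n * p} is empty
    if q == 1:
        return -math.log(p)      # only N = n contributes
    return q * math.log(rho) + (1 - q) * math.log((1 - q) / (1 - p))


n, p, rho = 64, 0.1, 2.0         # illustrative values, not taken from the paper
threshold = math.ceil(rho * n * p)
exact = binomial_upper_tail(n, p, threshold)
bound = math.exp(-n * chernoff_rate(p, rho))
print(exact, bound)
```

The exact tail always lies below the bound; in the proof the number of trials is $2^{\alpha N}$, which is what makes the resulting probability small enough to absorb the union over all indices.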

Therefore, calling A = v 'aM 2 — 1 and D + 1 = dM//3 c and using (6.4) and (6.6), we get 



k=i 

V VN N0f3c jK^)M N/M 2 

^ M 

k=AAD+l 



(6.9) 



where the last sum is not present if D < A. 
We have 

N i 

2 j e^ l3c M < 2 J e /3/?c V Cl(1+c)7VlogAr (6.10) 



It is immediate to see that, if D < A 

^e N ^ (J ^2 N / M2 < c^e^ (6.11) 

k=AAD+l 

It remains to estimate the first sum in the right hand side of (6.9). Let us call it $S(N)$. If we denote $x = k/M$, the maximum in the exponent occurs for $x = \beta/\beta_c$. Therefore, if $A < \frac{\beta}{\beta_c} M$ we easily get



S(N) < VNe 1300 ^ 01 ( 1 + c ) Nl °s N e Npp c ^ 

, (6-12) 

< ^fNe ppc ^ Cl (1+c) N log N e Npd 



where at the last step we have used $\beta_c \sqrt{\alpha} < \beta < d$.

If $\frac{\beta}{\beta_c} M \le A < D$ we easily get



S(N) < y /N e Mcy/c 1 (i+c)Nl og N e N(£-+ a £- ) (6 _ 13) 

If $D < A$, since $d > \beta$, the maximum of the exponent occurs inside the interval of summation; therefore we easily get that

S(N) < v/jVe^vMi+ C )Ano g N e JV(^+af ) (6 _ 14) 

Collecting (6.10) to (6.14) we get (4.52) and (4.53).

Lemma 4.6 is proved in exactly the same way, by making a similar partition of $[dN, +\infty)$, for proving (4.26). Restricting the sum over $k$ to just the one corresponding to $k = M\beta/\beta_c$, it is easy to get (4.27).